
L Number | Hits | Search Text | DB | Time stamp
1 | 482 | heap and pile | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:40
2 | 0 | heap and pile and identif4 with root | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:20
3 | 0 | heap and pile and identif$4 with root | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:20
4 | 0 | heap and pile and identif$4 same root | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:21
5 | 2 | heap and pile and identif$4 same (root parent) | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:21
6 | 0 | heap and pile and identify 4 same (root parent) and unused | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:21
7 | 0 | heap and pile and identif$4 same (root parent) and un$used | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:22
8 | 23170 | add with operation | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:22
9 | 160 | add with operation and root with leaf | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:23
10 | 19 | add with operation and root with leaf and identif$4 with root | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:23
11 | 3 | add with operation and root with leaf and identif$4 with root and un$used | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:24
12 | 3 | add with operation and root with leaf and identif$4 with root and (un$used "not used") | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:24
13 | 17 | add with operation and root with leaf and identif$4 with root and (un$used "not used" available) | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:25
14 | 2 | add with operation and root with leaf and identif$4 with root and (un$used "not used" available) with leaf | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:25
15 | 2 | add with operation and root with leaf and identif$4 with root and (un$used "not used" available) with leaf and travers$4 | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:26
17 | 0 | add with operation and root with leaf and identif$4 with root and (un$used "not used" available) with leaf and travers$4 with root with leaf and heap | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:27
18 | 0 | add with operation and root with leaf and identif$4 with root and (un$used "not used" available) with leaf and travers$4 with root with leaf and pile | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:27
19 | 0 | add with operation and root with leaf and identif$4 with root and (un$used "not used" available) with leaf and travers$4 with root with leaf and public adj domain | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:28
20 | 0 | add with operation and root with leaf and identif$4 with root and (un$used "not used" available) with leaf and travers$4 with root with leaf and data adj structure | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:28
16 | 2 | add with operation and root with leaf and identif$4 with root and (un$used "not used" available) with leaf and travers$4 with root with leaf | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:32
21 | 2 | add with operation and (root parent) with (leaf child$4) and identif$4 with (root parent) and (un$used "not used" available) with (leaf child$4) and travers$4 with root with leaf | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:33
22 | 16 | (root parent) with (leaf child$4) and identif$4 with (root parent) and (un$used "not used" available) with (leaf child$4) and travers$4 with root with leaf | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:33
24 | 0 | (root parent) with (leaf child$4) and identif$4 with (root parent) and (un$used "not used" available) with (leaf child$4) and travers$4 with root with leaf and add$4 and heap | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:35
23 | 16 | (root parent) with (leaf child$4) and identif$4 with (root parent) and (un$used "not used" available) with (leaf child$4) and travers$4 with root with leaf and add$4 | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 11:16
25 | 0 | heap and pile and identif$4 with (root parent) and (available un$used "not used") with (node leaf child$4) | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:42
26 | 38 | heap and pile and travers$4 | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:42
27 | 0 | heap and pile and travers$4 and (root parent) and (leaf child) with (available unused) | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:42
28 | 0 | 707/$.ccls. and heap and pile and travers$4 | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 10:43
29 | 10697 | remov$4 adj operation | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 11:17
30 | 203 | remov$4 adj operation and data adj structure | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 11:18
31 | 2 | remov$4 adj operation and data adj structure and remov$4 with root with value | USPAT; EPO; JPO; IBM_TDB | 2003/04/01 11:19

Search History 4/1/03 12:55:55 PM
C:\APPS\EAST\Workspaces\93-Nadj.wsp



United States Patent [19]
Demers et al.

US006105018A
[11] Patent Number: 6,105,018
[45] Date of Patent: Aug. 15, 2000

[54] MINIMUM LEAF SPANNING TREE

[75] Inventors: Alan Demers, Boulder Creek; Alan Downing, Fremont, both of Calif.

[73] Assignee: Oracle Corporation, Redwood Shores, Calif.

[21] Appl. No.: 09/049,285

[22] Filed: Mar. 26, 1998

[51] Int. Cl.7 G06F 17/30
[52] U.S. Cl. 707/2; 707/5; 707/4; 707/3; 707/1
[58] Field of Search 707/2, 5, 4, 3, 707/1

[56] References Cited

U.S. PATENT DOCUMENTS

5,701,467 12/1997 Freeston 707/100
5,781,906 7/1998 Aggarwal et al. 707/102
6,006,233 12/1999 Schultz 707/101

OTHER PUBLICATIONS

CRC Dictionary of Computer Science, Engineering and Technology, Mar. 2000.

Primary Examiner: Thomas G. Black
Assistant Examiner: William Trinh
Attorney, Agent, or Firm: McDermott, Will & Emery

[57] ABSTRACT



An efficient set of indexes to cover a plurality of anticipated 
query types is determined by building a directed acyclic 
graph whose nodes correspond to anticipated query types. A 
minimum leaf spanning tree for the equivalent graph is 
determined by repeatedly finding an augmenting path for a 
current spanning tree and producing a reduced leaf spanning 
tree based on the current spanning tree and the augmenting 
path until an augmenting path can no longer be found. The 
leaves of the minimum leaf spanning tree indicate which 
indexes should be built. 
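
The idea in the abstract can be illustrated with a small sketch. It is not the procedure described in this patent: it swaps in a standard maximum bipartite matching (Kuhn's augmenting-path search) to pick tree parents, on the assumption that an edge runs from one column combination to another whenever the first is a proper subset of the second. The names `build_dag`, `min_leaf_spanning_tree`, and `leaves` are hypothetical.

```python
from typing import Dict, FrozenSet, List

Combo = FrozenSet[str]
ROOT: Combo = frozenset()  # the empty combination, assumed to be the root


def build_dag(combos: List[Combo]) -> Dict[Combo, List[Combo]]:
    """Assumed edge rule: u -> v when u is a proper subset of v."""
    nodes = list(dict.fromkeys([ROOT] + [c for c in combos if c]))
    return {u: [v for v in nodes if u < v] for u in nodes}


def min_leaf_spanning_tree(dag: Dict[Combo, List[Combo]]) -> Dict[Combo, Combo]:
    """Return child -> parent links of a spanning tree with few leaves.

    Every non-root node that is matched to a child becomes internal, so a
    maximum matching leaves as few leaf nodes as possible.
    """
    parent_of: Dict[Combo, Combo] = {}  # child -> matched non-root parent

    def try_assign(u: Combo, seen: set) -> bool:
        # Look for an augmenting path that gives parent u a child of its own.
        for v in dag[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in parent_of or try_assign(parent_of[v], seen):
                parent_of[v] = u
                return True
        return False

    for u in dag:
        if u != ROOT:
            try_assign(u, set())

    # Nodes without a matched parent hang directly off the root.
    return {v: parent_of.get(v, ROOT) for v in dag if v != ROOT}


def leaves(tree: Dict[Combo, Combo]) -> List[Combo]:
    internal = set(tree.values())
    return [v for v in tree if v not in internal]


combos = [frozenset("a"), frozenset("c"), frozenset("bc"),
          frozenset("abc"), frozenset("abcd")]
tree = min_leaf_spanning_tree(build_dag(combos))
print(len(leaves(tree)))  # 2 leaves for these five combinations
```

Each leaf stands for one index to build; the chain from the root down to that leaf suggests a column order that lets the index serve every combination on the chain.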

26 Claims, 12 Drawing Sheets 




04/01/2003, EAST Version: 1.03.0002 



U.S. Patent Aug. 15, 2000 Sheet 1 of 12 6,105,018







U.S. Patent Aug. 15, 2000 Sheet 2 of 12 6,105,018 




FIG. 2 






U.S. Patent Aug. 15, 2000 Sheet 3 of 12 6,105,018 




FIG. 3 






U.S. Patent Aug. 15, 2000 Sheet 4 of 12 6,105,018

400: Build a directed, acyclic graph (DAG) equivalent to column combinations
402: Find a minimum leaf spanning tree for the DAG
404: Build indexes based on leaf nodes of the minimum leaf spanning tree

FIG. 4






U.S. Patent Aug. 15, 2000 Sheet 5 of 12 6,105,018

Entry | NODE | COLS | EDGES (edge: target node) | PARENT | MARK
500 | 200 | {} | 501: 210; 502: 220; 503: 230; 504: 240; 505: 250 | | 0
510 | 210 | {a} | 514: 240; 515: 250 | 200 | 0
520 | 220 | {c} | 523: 230; 524: 240; 525: 250 | 200 | 0
530 | 230 | {b c} | 534: 240; 535: 250 | 220 | 0
540 | 240 | {a b c} | 545: 250 | 230 | 0
550 | 250 | {a b c d} | | 230 | 0

FIG. 5






U.S. Patent Aug. 15, 2000 Sheet 6 of 12 6,105,018

[Flowchart. Legible box text, in order: Find an initial spanning tree for DAG; Set current spanning tree to initial spanning tree; Find an augmenting path for current spanning tree in DAG, if it exists; Adjust edges of current spanning tree based on augmenting path; (NO branch) Establish current spanning tree as a minimum spanning tree. Legible reference numerals: 602, 604, 610.]

FIG. 6






U.S. Patent Aug. 15, 2000 Sheet 7 of 12 6,105,018




FIG. 7(a) 




FIG. 7(b) 






U.S. Patent Aug. 15, 2000 Sheet 8 of 12 6,105,018




Determine whether 
there is an augmenting 
path starting from the 
current node 




FIG. 8(a) 






U.S. Patent Aug. 15, 2000 Sheet 9 of 12 6,105,018 



[Flowchart. Legible text: (FROM 804 or 822); "... augmenting path starting from the other child node".]

FIG. 8(b)






U.S. Patent Aug. 15, 2000 Sheet 10 of 12 6,105,018




FIG. 9 






U.S. Patent Aug. 15, 2000 Sheet 11 of 12 6,105,018

Column | 1010 | 1020 | 1022 | 1024 | 1026
Row | (ROWID) | A | B | C | D
1030 | 1 | 1 | 1 | 1 | 2
1032 | 2 | 3 | 7 | 2 | 8
1034 | 3 | 2 | 4 | 3 | 6
1036 | 4 | 4 | 2 | 4 | 4
1038 | 5 | 7 | 8 | 5 | 1

TABLE T1 1000

FIG. 10
(PRIOR ART)



KEYVALUE | (ROWID)
1,1 | 1
2,7 | 2
3,4 | 3
4,2 | 4 (entry 1202)
5,8 | 5

INDEX ON C, B

FIG. 12
(PRIOR ART)






U.S. Patent Aug. 15, 2000 Sheet 12 of 12 6,105,018

Column | 1110 | 1120
Entry | KEYVALUE | (ROWID)
1130 | 1 | 1
1132 | 2 | 3
1134 | 3 | 2
1136 | 4 | 4
1138 | 7 | 5

INDEX ON A 1100

FIG. 11(a)
(PRIOR ART)



[B-tree 1100: nodes 1150, 1160, 1162, 1164, 1172 and 1174, with connections 1170 and 1176 to portions not shown. Legible range labels: 1...2 and 5...7. Legible leaf entries: 3:2 and 4:4.]

FIG. 11(b)
(PRIOR ART)







MINIMUM LEAF SPANNING TREE 

FIELD OF THE INVENTION 

The present invention relates to computer database systems
and more particularly to efficiently executing a query
in a database.

BACKGROUND OF THE INVENTION 

In a relational database, information is stored in indexed 
tables. A user retrieves information from the tables by 
entering input that is converted to queries by a database 
application. The database application submits the queries to 
a database server. In response to a query, the database server 
accesses the tables specified in the query to determine which 
information within the tables satisfies the queries. The 
information that satisfies the queries is then retrieved by the 
database server and transmitted to the database application 
and ultimately to the user. Queries may also be internally 
generated and executed by a database system for performing 
administrative operations. 

For any given database application, the queries must
conform to the rules of a particular query language. Most
query languages provide users with a variety of ways to
specify information to be retrieved. For example, in the
Structured Query Language (SQL), the following query
requests the retrieval of the information contained in all
columns of rows in table T1 in which the value of column
a is 2:

Query 1

select * from T1
where a=2

Table T1 (1000) is shown in FIG. 10 and comprises four
user columns, 1020-1026, and five rows (1030-1038). Table 
1000 also has an internal column 1010, or pseudocolumn, 
referred to as rowid. A table's rowid pseudocolumn is not 
displayed when the structure of the table is listed. However, 
the rowid is retrievable by query and uniquely identifies a 
row in the table. Rowid pseudocolumn 1010 has rowid 
entries that correspond to rows 1030-1038. Thus, a rowid of 
2 for table 1000 specifies row 1032 and no other row of table 
1000. Columns 1020-1026 each store data, in this example 
numbers, and each column has a name. The name of column 
1020 is a and the names of columns 1022, 1024, and 1026 
are b, c, and d, respectively. 

Without special processing, a database server would have 
to fetch every row of a table and inspect every column 
named in the where clause to perform the query. Such an 
approach, however, impairs the overall database system 
performance because many disk blocks would have to be 
read. Consequently, many database systems provide indexes 
to increase the speed of the data retrieval process. A database
index is similar to a normal index found at the end of a book,
in that both kinds of indexes comprise an ordered list of
information accompanied by the location of the information.
Values in one or more columns are stored in an index,
maintained separately from the actual database table.

In FIG. 11(a), index 1100 is an index built on column a of 
table 1000. Each entry 1130-1138 in index 1100 has a key 
value 1110 and a rowid 1120. Since the key values are 
ordered, it can quickly be determined, for example, that the 
row having a key value of 2 in column a is associated with 
rowid 3 (see index entry 1132). An index may be implemented
in a variety of ways well known in the art, such as
with B-trees, depending on the specific performance
characteristics desired for the database system. As changes are
made to the table upon which an index is built, the index
must be updated to reflect the changes.
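
The difference between a full scan and an index lookup can be sketched as follows. This is a minimal illustration, not the database server's implementation: the index is modeled as a sorted Python list searched with `bisect`, the row values are those read from the FIG. 10 sheet, and the names `full_scan` and `index_lookup` are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Table T1 as read from FIG. 10: rowid -> (a, b, c, d).
T1 = {
    1: (1, 1, 1, 2),
    2: (3, 7, 2, 8),
    3: (2, 4, 3, 6),
    4: (4, 2, 4, 4),
    5: (7, 8, 5, 1),
}


def full_scan(a_value):
    """Inspect every row, as a server without an index would have to."""
    return [rowid for rowid, row in T1.items() if row[0] == a_value]


# Index on column a: (key value, rowid) pairs kept in key order,
# mirroring entries 1130-1138 of FIG. 11(a).
INDEX_A = sorted((row[0], rowid) for rowid, row in T1.items())
KEYS_A = [key for key, _ in INDEX_A]


def index_lookup(a_value):
    """Binary-search the ordered keys instead of reading every row."""
    lo = bisect_left(KEYS_A, a_value)
    hi = bisect_right(KEYS_A, a_value)
    return [rowid for _, rowid in INDEX_A[lo:hi]]


print(index_lookup(2))  # [3], the rowid reported by index entry 1132
```

Both functions return the same rowids for Query 1; only the amount of data touched differs. The sorted list also has to be rebuilt or adjusted whenever `T1` changes, which is the maintenance cost noted above.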

FIG. 11(b) shows a B-tree implementation of index 1100.
A B-tree consists of a set of nodes connected in a hierarchical
arrangement. A B-tree contains two types of nodes:
branch nodes and leaf nodes. Leaf nodes reside at the lowest
level of the hierarchy and contain values from the actual
column to which the index corresponds. For example, B-tree
1100 is an index for column a 1020 of table 1000 and has
leaf nodes 1172 and 1174. Node 1174 is a leaf node that
contains a value from column a 1020. Along with the values,
leaf nodes store the rowid of the rows that contain the values.
For example, in addition to the number 3, leaf node 1172
contains the rowid 2 which corresponds to the row 1032 of
table 1000 that contains the number 3 in column 1020. In
other words, leaf node 1172 contains index entry 1134, and
a leaf node may contain more than one index entry.

All the nodes in B-tree 1100 that are not leaf nodes are
branch nodes. Branch nodes contain information that indicates
a range of values. In the illustrated B-tree 1100, nodes
1150, 1160, 1162, and 1164 are branch nodes and thus
correspond to a range of values. The range of values
identified in each branch node is such that all nodes that
reside below a given branch node correspond to values that
fall within the range of values represented by the branch
node. For example, node 1162 is a branch node that corresponds
to numbers in the numerical range from three to four.
Consequently, nodes 1172 and 1174, which both reside below
node 1162 in the hierarchy, correspond to values that fall
within the range from three to four. Reference numbers 1170 and
1176 represent connections to other portions of B-tree 1100
that are not shown.
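
The branch and leaf roles can be pictured with a small sketch. It is a simplified stand-in rather than a real B-tree: the classes `Branch` and `Leaf` and the function `search` are hypothetical, and the tree layout below (ranges 1 to 2, 3 to 4, and 5 to 7 under a single top node) is an assumed arrangement, since FIG. 11(b) is only partly legible.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Leaf:
    # (key value, rowid) index entries held at the lowest level
    entries: List[Tuple[int, int]]


@dataclass
class Branch:
    low: int   # smallest key found anywhere below this node
    high: int  # largest key found anywhere below this node
    children: List[Union["Branch", Leaf]] = field(default_factory=list)


def search(node, key):
    """Descend only into branches whose range can contain the key."""
    if isinstance(node, Leaf):
        return [rowid for k, rowid in node.entries if k == key]
    if not (node.low <= key <= node.high):
        return []  # the whole subtree is skipped
    found = []
    for child in node.children:
        found.extend(search(child, key))
    return found


# Assumed layout for the index on column a.
tree = Branch(1, 7, [
    Branch(1, 2, [Leaf([(1, 1)]), Leaf([(2, 3)])]),
    Branch(3, 4, [Leaf([(3, 2)]), Leaf([(4, 4)])]),
    Branch(5, 7, [Leaf([(7, 5)])]),
])

print(search(tree, 3))  # [2], matching the entry in leaf node 1172
```

A production B-tree would pick the single matching child by comparing separator keys instead of testing every child, but the pruning principle is the same.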

A database server can use index 1100 to process the
exemplary query listed above because index 1100 is built on
a column referenced in one of the predicates of the where
clause. Specifically, the where clause contains the predicate
a=2, and index 1100 is built on column a. Not all indexes
built upon a table are useful for executing an arbitrary query.
For example, the following query may be executed for table
T1 1000:

Query 2

select * from T1
where b=2 and c=4

In this case, using index 1100, built upon column a 1020,
does not aid in retrieving data for QUERY 2 because column
a is not one of the columns referenced in the where clause.
On the other hand, if an index is built upon column b 1022,
column c 1024, or both, then the performance of data
retrieval operations for QUERY 2 can be improved. In
particular, a "multi-column index" may be built on more
than one column of a table; for example index 1200,
illustrated in FIG. 12, is built upon columns c 1024 and b
1022. The key value of a multi-column index is a concatenation
of column values from the table upon which the
multi-column index was built. For example, in FIG. 12, the
key value for entry 1202 lists a value of 4 taken from column
c 1024 of table T1 1000 followed by a value of 2 from
column b 1022 of table 1000. This key value identifies row
1036 by means of rowid 4. Thus, multi-column index 1200
can be used in processing QUERY 2, because it was built
upon both columns referenced in the query.
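
A concatenated key can be sketched with tuples. This is an illustration under assumptions, not the server's storage format: keys are modeled as Python tuples `(c, b)` in a sorted list, the row values are those read from the FIG. 10 sheet, and `lookup_c_b` and `lookup_c` are hypothetical names. Because the list is ordered by c first, a lookup on the leading column c alone can reuse the same structure.

```python
from bisect import bisect_left, bisect_right

# Table T1 as read from FIG. 10: rowid -> (a, b, c, d).
T1 = {
    1: (1, 1, 1, 2),
    2: (3, 7, 2, 8),
    3: (2, 4, 3, 6),
    4: (4, 2, 4, 4),
    5: (7, 8, 5, 1),
}

# Index on (c, b): the key concatenates column c, then column b.
INDEX_CB = sorted(((row[2], row[1]), rowid) for rowid, row in T1.items())
KEYS_CB = [key for key, _ in INDEX_CB]


def lookup_c_b(c, b):
    """Point lookup on the full key, as needed for b=2 and c=4."""
    key = (c, b)
    lo = bisect_left(KEYS_CB, key)
    hi = bisect_right(KEYS_CB, key)
    return [rowid for _, rowid in INDEX_CB[lo:hi]]


def lookup_c(c):
    """Point lookup on the leading column only (integer keys assumed)."""
    lo = bisect_left(KEYS_CB, (c,))      # first key whose c is >= c
    hi = bisect_left(KEYS_CB, (c + 1,))  # first key whose c is >= c + 1
    return [rowid for _, rowid in INDEX_CB[lo:hi]]


print(lookup_c_b(4, 2))  # [4], the row identified by entry 1202
```

A lookup on b alone gets no help from this ordering, because rows with equal b values are scattered throughout the list.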

One property of a multi-column index is that it improves data processing for "point lookup" queries referencing the first n (n ≥ 1) columns upon which the multi-column index is built. In contrast to a "range lookup," a point lookup identifies a row or set of rows by specifying a specific value for one or more columns. Thus, the search criteria associated with point lookups include an equality operator, but not an inequality operator (e.g. a greater than ">" operator) which identifies a range of rows. In the example, since multi-column index 1200 is built upon column c 1024 and column b 1022 in that order, point lookups referring only to column c 1024 can profitably use multi-column index 1200. The following QUERY 3 is an example of a query that can use a point lookup on multi-column index 1200:

Query 3

select * from T1
where c=2

In certain circumstances, it may be known to a relational database system that there are particular combinations of columns of a table that are most likely to be referenced in queries. For example, it may be known for Table T1 1000 that QUERIES 1, 2, and 3 are fairly common operations, referencing combinations of columns {a}, {b, c}, and {c}, respectively. In addition, it may also be known that combinations of columns {a, b, c} and {a, b, c, d} are commonly used in queries. Conversely, many other combinations of columns are rarely referenced in queries received by the database. In the example, it may be a relatively rare occurrence that only column d is referenced in queries.

Since indexes are useful in improving the processing performance of a relational database, one approach for providing indexes would be to provide an index built on every combination of table columns frequently referenced in queries received by the database. In the example presented above, this approach would call for an index to be created for each of the five frequent combinations of columns, viz. {a}, {c}, {b, c}, {a, b, c} and {a, b, c, d}.

However, building and maintaining an index is costly. For example, each time a row is added to Table T1 1000, an entry corresponding to the added row must be added to each index built upon the table. Thus, if there are five indexes built upon table T1 1000, then the five indexes have to be updated each time a row is added to the table. Likewise, each index built upon a table must be updated each time a row is deleted from the table or a column value referenced by an index is modified.

Since a query referencing a first combination of columns can use an index built upon a second combination of columns if the first combination is a prefix of the second combination, it is advantageous to use the same index for a query referencing the first combination of columns and for a query referencing the second combination of columns. For example, QUERY 2 and QUERY 3, referencing column combinations {b, c} and {c}, respectively, can use index 1200, which was built upon columns c and b. Both queries realize the performance benefits of using an index, but the maintenance costs of a second index are eliminated.

Failing to create an index that can handle an anticipated query type results in increased access and retrieval costs of executing the query. On the other hand, creating a separate index to handle each respective anticipated query type may result in excessive index maintenance costs, when it is possible for two queries to share an index.

SUMMARY OF THE INVENTION

What is needed is a technique for determining a set of indexes for a table that can efficiently handle a group of anticipated query types, each query type referencing a respective combination of the table's columns.

This and other needs are met by the present invention in which an equivalent graph is built based on the combination of the table's columns. A minimum leaf spanning tree for the graph is found and indexes are created for the table based on the minimum leaf spanning tree. The leaves of a spanning tree of the equivalent graph correspond to a set of indexes that can cover the anticipated query types, and minimizing the number of leaves in such a spanning tree results in an efficient set of indexes.

One aspect of the invention is a computer-implemented method and a computer-readable medium bearing instructions arranged to cause one or more processors to perform a method of creating one or more indexes for a body of data arranged in columns, which indexes are used to support query types referencing respective combinations of one or more columns. The method includes the steps of: building a graph based on the respective combinations; finding a minimum leaf spanning tree for the graph; and creating one or more indexes based on the minimum leaf spanning tree.

Another aspect of the invention is a computer-implemented method and a computer-readable medium bearing instructions for finding a minimum leaf spanning tree for a directed acyclic graph (DAG) by finding an initial spanning tree for the DAG and establishing the initial spanning tree as a current spanning tree. If an augmenting path is determined to exist for the current spanning tree, then a new spanning tree having fewer leaves than the current spanning tree is produced based on the augmenting path and established as the current spanning tree. The steps of finding an augmenting path and producing a new spanning tree with a reduced number of leaves are repeatedly performed until an augmenting path can no longer be found. The current spanning tree at the end of this loop is established as the minimum leaf spanning tree.

Additional objects, advantages, and novel features of the present invention will be set forth in part in the description that follows, and in part, will become apparent upon examination or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1 depicts a computer system that can be used to implement the present invention;

FIG. 2 depicts a directed, acyclic graph representing prefix relationships for an exemplary combination of columns according to an embodiment of the present invention;

FIG. 3 depicts a spanning tree of the graph shown in FIG. 2;

FIG. 4 is a flowchart illustrating steps of finding an efficient set of indexes according to an embodiment of the present invention;

FIG. 5 depicts a data structure that can be used to implement the graph and spanning tree shown in FIG. 3;

FIG. 6 is a flowchart illustrating steps of finding a minimum leaf spanning tree according to an embodiment of the present invention;

FIG. 7(a) illustrates an augmenting path for a spanning tree in a directed acyclic graph;

FIG. 7(b) illustrates a new spanning tree related to the spanning tree and augmenting path shown in FIG. 7(a) that has a fewer number of leaves;






FIGS. 8(a) and 8(b) are flowcharts illustrating steps of finding an augmenting path for a spanning tree of a graph according to an embodiment of the present invention;

FIG. 9 depicts a minimum leaf spanning tree for the graph shown in FIG. 2;

FIG. 10 depicts an exemplary table;

FIG. 11(a) depicts an index built upon the table shown in FIG. 10;

FIG. 11(b) illustrates a B-Tree implementation of the index shown in FIG. 11(a); and

FIG. 12 depicts a multi-column index built upon the table shown in FIG. 10.

DESCRIPTION OF THE PREFERRED EMBODIMENT

A method and apparatus are described for creating one or more indexes for a body of data arranged in columns to support a plurality of query types, each of which references a respective combination of one or more of said columns. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

Hardware Overview

FIG. 1 is a block diagram that illustrates a computer system 100 upon which an embodiment of the invention may be implemented. Computer system 100 includes a bus 102 or other communication mechanism for communicating information, and a processor 104 coupled with bus 102 for processing information. Computer system 100 also includes a main memory 106, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 102 for storing information and instructions to be executed by processor 104. Main memory 106 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 104. Computer system 100 further includes a read only memory (ROM) 108 or other static storage device coupled to bus 102 for storing static information and instructions for processor 104. A storage device 110, such as a magnetic disk or optical disk, is provided and coupled to bus 102 for storing information and instructions.

Computer system 100 may be coupled via bus 102 to a display 112, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 114, including alphanumeric and other keys, is coupled to bus 102 for communicating information and command selections to processor 104. Another type of user input device is cursor control 116, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 104 and for controlling cursor movement on display 112. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.

The invention is related to the use of computer system 100 for creating an efficient set of indexes. According to one embodiment of the invention, creating an efficient set of indexes is provided by computer system 100 in response to processor 104 executing one or more sequences of one or more instructions contained in main memory 106. Such instructions may be read into main memory 106 from another computer-readable medium, such as storage device 110. Execution of the sequences of instructions contained in main memory 106 causes processor 104 to perform the process steps described herein. One or more processors in a multi-processing arrangement may also be employed to execute the sequences of instructions contained in main memory 106. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.

The term "computer-readable medium" as used herein refers to any medium that participates in providing instructions to processor 104 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as storage device 110. Volatile media include dynamic memory, such as main memory 106. Transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise bus 102. Transmission media can also take the form of acoustic or light waves, such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.

Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 104 for execution. For example, the instructions may initially be borne on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 100 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector coupled to bus 102 can receive the data carried in the infrared signal and place the data on bus 102. Bus 102 carries the data to main memory 106, from which processor 104 retrieves and executes the instructions. The instructions received by main memory 106 may optionally be stored on storage device 110 either before or after execution by processor 104.

Computer system 100 also includes a communication interface 118 coupled to bus 102. Communication interface 118 provides a two-way data communication coupling to a network link 120 that is connected to a local network 122. For example, communication interface 118 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 118 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 118 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

Network link 120 typically provides data communication through one or more networks to other data devices. For




example, network link 120 may provide a connection
through local network 122 to a host computer 124 or to data
equipment operated by an Internet Service Provider (ISP)
126. ISP 126 in turn provides data communication services
through the world wide packet data communication
network, now commonly referred to as the "Internet"
128. Local network 122 and Internet 128 both use electrical,
electromagnetic or optical signals that carry digital data
streams. The signals through the various networks and the
signals on network link 120 and through communication
interface 118, which carry the digital data to and from
computer system 100, are exemplary forms of carrier waves
transporting the information.

Computer system 100 can send messages and receive
data, including program code, through the network(s), network
link 120, and communication interface 118. In the
Internet example, a server 130 might transmit a requested
code for an application program through Internet 128, ISP
126, local network 122 and communication interface 118. In
accordance with the invention, one such downloaded application
provides for creating an efficient set of indexes as
described herein.

The received code may be executed by processor 104 as
it is received, and/or stored in storage device 110, or other
non-volatile storage for later execution. In this manner,
computer system 100 may obtain application code in the
form of a carrier wave.

Representing Relationships Between Column
Combinations

The relationships between the column combinations referenced
by anticipated query types can be expressed in the
form of a directed acyclic graph (DAG). A DAG is a data
structure comprising nodes connected by edges, in which
relationships between nodes are expressed by edges directed
from one node in the graph to another node in the graph. The
term "acyclic" means that the edges do not form loops in the
graph; thus, travelling from node to node in an acyclic graph
via directed edges would eventually terminate in a node
having no edge emanating therefrom.

Recall that if a first combination of columns is a prefix of
a second combination of columns, then a query referencing
the first column combination can advantageously use for
point lookups an index built upon the columns of the second
combination of columns. For example, a query referencing
column combination {c} can advantageously use an index
built upon columns c and b for point lookups, which is one
of the indexes specified by column combination {b, c}.
Thus, the column combination {b, c}, of which column
combination {c} is a subset, specifies at least one index that
column combination {c} can potentially be shared. This
relationship may be expressed generally within a DAG by a
first node representing a first column combination, a second
node representing a second column combination, of which
the first column combination is a subset, and an edge
directed from the first node to the second node. The exemplary
relationship between column combinations {c} and {b,
c} can be represented in a DAG by a first node representing
column combination {c}, a second node representing column
combination {b, c}, and an edge directed from the first
node to the second node. As another example, FIG. 2
illustrates a DAG that expresses the relationships between
the following combination of columns: {a}, {c}, {b, c}, {a,
b, c}, and {a, b, c, d}, corresponding to nodes 210, 220, 230,
240, and 250, respectively.

Node 210 represents column combination {a} and has two
directed edges 214 and 215 emanating therefrom. Since a
query referencing column combination {a} can use an index
built upon columns a, b, and c (corresponding to node 240),
there is a directed edge 214 from node 210 to node 240.
Likewise, since a query referencing column combination
{a} can use an index built upon columns a, b, c, and d
(corresponding to node 250), there is a directed edge 215
from node 210 to node 250. There is no edge directed from
node 210 to either node 220 or 230, because a query
referencing column combination {a} cannot advantageously
use for point lookups an index built upon either column
combination {c} or {b, c}, respectively.

Node 220 represents column combination {c} and has
three directed edges 223, 224 and 225 emanating therefrom.
Since a query referencing column combination {c} can use
indexes built upon any of column combinations {b, c}, {a,
b, c}, and {a, b, c, d}, corresponding to nodes 230, 240, and
250, respectively, the directed edges 223, 224, and 225 point
to nodes 230, 240, and 250, respectively. There is no edge
directed from node 220 to node 210, because a query
referencing column combination {c} cannot advantageously
use for point lookups an index built upon column combination
{a}.

Node 230 represents column combination {b, c} and has
two directed edges 234 and 235 emanating therefrom. Since
a query referencing column combination {b, c} can use
indexes built upon either of column combinations {a, b, c}
and {a, b, c, d}, corresponding to nodes 240 and 250,
respectively, the directed edges 234 and 235 point to nodes
240 and 250, respectively. There is no edge directed from
node 230 to either of nodes 210 or 220, because a query
referencing column combination {b, c} cannot advantageously
use for point lookups an index built upon either
column combination {a} or {c}, respectively.

Node 240 represents column combination {a, b, c} and
has one directed edge 245 emanating therefrom. Since a
query referencing column combination {a, b, c} can use an
index built upon column combination {a, b, c, d}, corresponding
to node 250, the directed edge 245 points to node
250. There is no edge directed from node 240 to any of
nodes 210, 220, and 230, because a query referencing
column combination {a, b, c} cannot advantageously use for
point lookups an index built upon any of column combination
{a}, {c}, and {b, c}, respectively.

Node 250 represents column combination {a, b, c, d} and
has no directed edges emanating therefrom. There is no edge
directed from node 250 to any of nodes 210, 220, 230, and
240, because a query referencing column combination {a, b,
c, d} cannot advantageously use for point lookups an index
built upon any of column combination {a}, {c}, {b, c}, and
{a, b, c}, respectively.

Node 200 represents an empty column combination { },
which is trivially a prefix of every other column
combination, viz. {a}, {c}, {b, c}, {a, b, c}, and {a, b, c, d},
corresponding to nodes 210, 220, 230, 240, and 250, respectively.
Accordingly, node 200 has edges 201, 202, 203, 204,
and 205 directed to nodes 210, 220, 230, 240, and 250,
respectively. However, since none of the other column
combinations is a prefix of the empty column combination
{ }, no other node has an edge directed therefrom to node
200.

A "rooted DAG" is a DAG that has "a root node" from
which every other node is reachable, and, by acyclicity, the
root node is unique and has no entering edges. Since node
200 can reach every other node, but no other node can reach
node 200, the addition of node 200 creates a rooted DAG.

A Spanning Tree of a Graph

A tree is a collection of elements in which one of the
elements is designated as a "root" and the remaining




elements, if any, are partitioned into one or more subtrees. 
Since this definition of a tree is recursive, one of the 
elements of the subtree is also designated as a root for the 
subtree and the remaining elements of the subtree, if any, are 
further partitioned into one or more subtrees. The root of a 
tree is the "parent" of the root of each constituent subtree; 
conversely, the root of each subtree is a "child" of the root 
of the encompassing tree. If a tree (or subtree) consists of 
only one element, that element is termed a "leaf" element.
Thus, a leaf element is not a parent of any other element in 
the tree. 

Elements of a tree can be represented as nodes of a DAG, 
and the parent-child relationship between the elements can 
be represented as directed edges in a DAG. Thus, a tree is 
a kind of a DAG. A spanning tree of a rooted DAG is a tree 
constructed from the graph using, or "spanning," all the 
nodes of the graph. Since edges of a DAG are directed, a 
spanning tree of a rooted DAG consists of the nodes of the
DAG as elements and uses the root node of the DAG as the 
root of the spanning tree. A spanning tree typically has fewer 
edges than the graph it spans. For example, a node in a DAG 
may be the destination of two or more edges, but only one 
of those edges would be in any one spanning tree of the 
graph. In general, a DAG can have more edges than nodes, 
but a spanning tree has exactly one fewer edge than nodes. 
For example, the DAG depicted in FIG. 2 has thirteen edges 
for six nodes, but spanning trees of the DAG contain only 
five edges. 

FIG. 3 depicts an exemplary spanning tree for the DAG 
illustrated in FIG. 2, which represents the relationships 
between the exemplary combinations of columns. The edges 
of the DAG that belong to the exemplary spanning tree are 
depicted as solid arrows. For example, edge 201 from node 
200 to node 210 is shown as a solid arrow and belongs to the 
exemplary spanning tree. The other edges in the spanning 
tree include edge 202 from node 200 to node 220, edge 223 
from node 220 to node 230, edge 234 from node 230 to node 
240, and edge 235 from node 230 to node 250. Referring 
again to FIG. 3, edges belonging to the DAG that are not in 
the spanning tree are depicted by a dashed arrow. For 
example, edge 245 from node 240 to node 250 is shown by 
a dashed arrow and is not in the exemplary spanning tree. 
There are five edges in the spanning tree for six nodes of the 
DAG. 

A leaf of a spanning tree is not a parent of any other node
in the spanning tree. In other words, a leaf node does not 
have any edges directed from itself in the spanning tree. 
Referring again to FIG. 3, node 250 is a leaf node in the 
exemplary spanning tree because there are no edges ema- 
nating therefrom. Node 240 is also a leaf node of the 
exemplary spanning tree, because the only edge emanating 
therefrom, namely edge 245 from node 240 to node 250, is 
not in the exemplary spanning tree. Node 220, however, is 
not a leaf node of the exemplary spanning tree, because node 
220 has an edge 223 from node 220 to node 230 that is in 
the exemplary spanning tree and depicted with a solid arrow. 
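The leaf test described above can be sketched in a few lines of Python. This is only an illustration: the spanning tree of FIG. 3 is assumed to be stored as a child-to-parent mapping keyed by the reference numerals, and the helper name `leaves` is hypothetical rather than taken from the patent.

```python
def leaves(parent):
    """Return the nodes that are not the parent of any other node."""
    has_child = {p for p in parent.values() if p is not None}
    return sorted(n for n in parent if n not in has_child)


# Spanning tree of FIG. 3 as child -> parent (solid edges 201, 202, 223, 234, 235).
parent = {200: None, 210: 200, 220: 200, 230: 220, 240: 230, 250: 230}

print(leaves(parent))  # [210, 240, 250]
```

Node 240 is reported as a leaf even though edge 245 leaves it in the DAG, because that edge is not part of the spanning tree.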

Conventional techniques such as a depth-first search or 
breadth-first search exist for finding a spanning tree for a 
DAG. A depth-first search is typically implemented by a 
recursive subroutine in which edges are successively fol- 
lowed from node to node until a leaf node is reached. When 
a leaf node is reached, the depth-first search backs up and
checks previous nodes for additional edges to as-yet- 
unvisited nodes to add to the initial spanning tree. 

For example, referring back to FIG. 2, root node 200 has
five exiting edges 201, 202, 203, 204, and 205. Among the
edges 201, 202, 203, 204, and 205, a depth-first search may
choose and traverse edge 202 to reach node 220, which is the
source for edges 223, 224, and 225. Subsequently, the
depth-first search may traverse edge 223 to reach node 230.
At node 230, edge 235 may be traversed to reach node 250,
which lacks an edge emanating therefrom. Accordingly, the
depth-first search backs up a level to node 230 and selects a
remaining edge, namely 234, to traverse, reaching node 240.
Although node 240 has an edge 245 emanating to node 250,
node 250 has already been visited by the depth-first search,
so that the depth-first search returns to node 230. Since all
destination nodes from node 230, viz. nodes 240 and 250,
have also been visited, the depth-first search returns back to
node 220 and thence to root node 200. At this point, the
depth-first search traverses edge 201 to reach node 210,
since edges 203, 204, and 205 point to visited nodes 230,
240, and 250, respectively.

This exemplary depth-first search finds an initial spanning
tree comprising edges 201, 202, 223, 234, and 235 and
illustrated in FIG. 3. The particular spanning tree found by
a depth-first search is typically dependent on the particular
order in which edges from a node are consulted or stored in
a data structure.
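The walk just described can be reproduced with a short recursive depth-first search. The Python sketch below is an illustration under stated assumptions: the DAG of FIG. 2 is modeled as a dictionary of successor lists keyed by reference numeral, the successor order is chosen by hand so that the search follows the order in the text, and the function name `dfs_spanning_tree` is hypothetical.

```python
def dfs_spanning_tree(dag, root):
    """Return {node: parent} for a depth-first spanning tree of a rooted DAG."""
    parent = {root: None}

    def visit(node):
        for succ in dag[node]:
            if succ not in parent:  # destination not yet visited
                parent[succ] = node  # the traversed edge joins the tree
                visit(succ)

    visit(root)
    return parent


# Successor lists for FIG. 2, ordered (by assumption) so that edge 202 is tried
# first at the root, edge 235 before edge 234 at node 230, and edge 201 last.
dag = {
    200: [220, 230, 240, 250, 210],
    210: [240, 250],
    220: [230, 240, 250],
    230: [250, 240],
    240: [250],
    250: [],
}

parent = dfs_spanning_tree(dag, 200)
print(parent)
```

Reordering the successor lists generally yields a different spanning tree, which is the order dependence noted in the text.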

Correspondence Between Spanning Tree Leaves
and Indexes

The leaves of a spanning tree of a graph representing
subset relationships between combinations of columns correspond
to the set of indexes that can cover all the anticipated
queries. For example, the spanning tree depicted in
FIG. 3 has three leaves: node 210 representing column
combination {a}, node 240 representing column combination
{a, b, c}, and node 250 representing column combination
{a, b, c, d}. Accordingly, the exemplary spanning tree
indicates that three indexes may be built upon the corresponding
column combinations in order to support all the
exemplary anticipated query types. In the example, since
node 210 representing column combination {a} is a leaf
node in the spanning tree, the spanning tree indicates that an
index may be built upon column a.

For indexes built on a plurality of columns, i.e., multi-column
indexes, the order of columns is significant. More
specifically, a multi-column index is built having the prefixed
columns placed before non-prefixed columns. In other
words, those columns specified in the ancestor nodes of a
leaf node in the spanning tree come before those columns
specified only in the leaf node. In the example, leaf node 240
represents column combination {a, b, c} and has node 230
representing column combination {b, c} as a parent and
thence node 220 representing column combination {c}.
Thus, the multi-column index corresponding to node 240 is
built on column c (specified in node 220), column b
(specified in node 230), and then column a (specified in node
240). Leaf node 250 includes two columns not specified in
any ultimate parent node, viz. column a and column d. In
this case, the order of non-prefixed columns is immaterial
for building an index; thus, either an index on columns c, b,
a, and d or on columns c, b, d, and a may be built.
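The column ordering rule can be sketched as a walk from the root down to a leaf that appends each column the first time it appears. The Python below is a minimal illustration: the `parent` and `cols` dictionaries and the name `index_columns` are assumptions, and sorting the newly introduced columns alphabetically is an arbitrary choice, since the text states that their relative order is immaterial.

```python
def index_columns(leaf, parent, cols):
    """Order the columns of a leaf so that ancestor (prefix) columns come first."""
    path = []
    node = leaf
    while node is not None:  # climb from the leaf up to the root
        path.append(node)
        node = parent[node]
    ordered = []
    for node in reversed(path):  # root first, leaf last
        new = sorted(set(cols[node]) - set(ordered))  # columns first seen at this node
        ordered.extend(new)
    return ordered


# Spanning tree of FIG. 3 and the column combination of each node.
parent = {200: None, 210: 200, 220: 200, 230: 220, 240: 230, 250: 230}
cols = {200: "", 210: "a", 220: "c", 230: "bc", 240: "abc", 250: "abcd"}

for leaf in (210, 240, 250):
    print(leaf, index_columns(leaf, parent, cols))
```

For leaf 250 this sketch emits c, b, a, d. The alternative order c, b, d, a named in the text is equally valid under the stated rule.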

In the example, the three indexes, a first index built on
column a, a second index built on columns c, b, and a, and
a third index built on columns c, b, d, and a, can cover the
exemplary anticipated query types referencing column combinations
{a}, {c}, {b, c}, {a, b, c}, and {a, b, c, d}. More
specifically, a query referencing column combination {a}
can use the first index built on column a. The queries
referencing column combinations {c} and {b, c} can use
either the second index built on columns c, b, and a, or the
third index built on columns c, b, d, and a. The query
referencing column combination {a, b, c} can use the second
index built on columns c, b, and a. Finally, the query
referencing column combination {a, b, c, d} can use the third
index built on columns c, b, d, and a.

A Minimum Leaf Spanning Tree 

A minimum leaf spanning tree of a graph is a spanning 
tree of the graph such that no other spanning tree of the 
graph has fewer leaves than the minimum leaf
spanning tree. In other words, the minimum leaf spanning 
tree of the graph has the fewest possible, or minimum, 
number of leaves. Moreover, a plurality of minimum leaf 
spanning trees can exist for a given DAG. 
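For a graph as small as the working example, the definition can be checked by exhaustive enumeration. The Python sketch below is not the augmenting-path method of the patent; it is a brute-force illustration that relies on the fact that, in a rooted DAG, choosing one incoming edge for every non-root node always yields a spanning tree. The function name and the dictionary layout are assumptions, and the running time grows exponentially with the number of nodes.

```python
from itertools import product


def min_leaf_spanning_tree(dag, root):
    """Try every choice of one parent per non-root node and keep the fewest leaves."""
    preds = {n: [] for n in dag}
    for src, succs in dag.items():
        for dst in succs:
            preds[dst].append(src)
    others = [n for n in dag if n != root]
    best = None
    for choice in product(*(preds[n] for n in others)):
        # A node is a leaf exactly when no node selected it as a parent.
        leaf_count = len(set(dag) - set(choice))
        if best is None or leaf_count < best[0]:
            best = (leaf_count, dict(zip(others, choice)))
    return best


# DAG of FIG. 2 as successor lists keyed by reference numeral.
dag = {
    200: [210, 220, 230, 240, 250],
    210: [240, 250],
    220: [230, 240, 250],
    230: [240, 250],
    240: [250],
    250: [],
}

leaf_count, tree = min_leaf_spanning_tree(dag, 200)
print(leaf_count)  # 2
print(tree)
```

The minimum is two leaves because nodes 210 and 220 can only hang from the root, so the root always has at least two subtrees, each ending in at least one leaf.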

A minimum leaf spanning tree differs conceptually from
a conventional "minimum spanning tree," which minimizes
the aggregate weight of edges within a graph. For clarity,
such conventional minimum spanning trees are termed
herein "minimum edge-weight spanning trees." Since all
spanning trees of a graph have the same number of edges,
one less than the number of nodes, minimum edge-weight
spanning trees are most meaningful for graphs that have
weighted edges. There are a variety of well-known techniques
for finding minimum edge-weight spanning trees, for
example, Kruskal's algorithm, Prim's algorithm, and Boruvka's
algorithm, none of which, however, is designed to
find a minimum leaf spanning tree. A minimum leaf spanning
tree, on the other hand, is a spanning tree that has a minimal
number of leaves without consideration of weights of the
edges. Thus, a minimum leaf spanning tree is well defined
even for a graph whose edges are not assigned weights.

As mentioned hereinabove, there is a need for determin- 
ing a minimal set of indexes for a table that can efficiently 
handle a group of anticipated query types, each query type 
referencing a respective combination of the table's columns. 
Since the leaves of a spanning tree of a graph representing 
the relationships between the column combinations indicate 
indexes that can cover all the anticipated query types and 
since for any set of n indexes there is a spanning tree having 
at most n leaves, the leaves of a minimum leaf spanning tree
indicate a minimal set of indexes that can efficiently handle 
the anticipated query types. Accordingly, one embodiment 
of the present invention meets this need by performing the 
steps illustrated in FIG. 4. 

Referring to FIG. 4, a directed acyclic graph (DAG) 
equivalent to the anticipated column combinations (step 
400) is built. A minimum leaf spanning tree is found for the 
DAG (step 402). A group of indexes is then built based on 
column combinations associated with the leaves of the 
minimum leaf spanning tree (step 406). Each of these steps 
shall be described in greater detail hereinafter. 

Building an Equivalent Graph 

Referring to step 400 in FIG. 4, a directed, acyclic graph 
(DAG) equivalent to the pattern of anticipated query types 
is built. In particular, the nodes of the DAG correspond to 
the respective column combinations, and the directed edges 
correspond to a subset relationship existing between column 
combinations. Moreover, the DAG is built with a root node 
that can reach every other node in the DAG. More formally, 
the nodes of such a DAG are { } and each column combination
$n_i$, and the edges of the DAG are $\{\,\} \to n_j$ for all $j$ and
$n_i \to n_j$ if and only if $n_i \subset n_j$. For the working example of
anticipated query types referencing column combinations 
{a}, {c}, {b, c}, {a, b, c}, and {a, b, c, d}, such an equivalent 
DAG is illustrated in FIG. 2, as described in more detail 
hereinbelow. 
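Step 400 can be sketched directly from the formal description. In the Python below, column combinations are assumed to be modeled as frozensets, with the empty frozenset standing in for root node 200. The names `build_dag` and `ROOT` are illustrative and do not appear in the patent.

```python
from itertools import permutations

ROOT = frozenset()  # the empty column combination { }, i.e. node 200


def build_dag(combos):
    """Return {node: [successors]} with an edge x -> y when x is a proper subset of y."""
    nodes = [ROOT] + [frozenset(c) for c in combos]
    dag = {n: [] for n in nodes}
    for x, y in permutations(nodes, 2):
        if x < y:  # proper-subset test; the empty root precedes every combination
            dag[x].append(y)
    return dag


combos = [{"a"}, {"c"}, {"b", "c"}, {"a", "b", "c"}, {"a", "b", "c", "d"}]
dag = build_dag(combos)
edge_count = sum(len(succs) for succs in dag.values())
print(len(dag), edge_count)  # 6 13
```

The six nodes and thirteen edges agree with the counts given earlier for the DAG of FIG. 2.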






There is a variety of techniques and data structures for
implementing a directed, acyclic graph, but the present
invention is not limited to any particular technique or data
structure. An "object-based" approach defines an object (e.g.,
a structure, record, instance of an abstract data type, or other
equivalent construct depending on the programming
language) to hold information for each vertex. Edges in an
object-based approach are implemented by another object or
equivalent construct, which includes references (e.g.,
pointers, cursors, indexes, addresses, and the like) to the
vertices they connect. In an adjacency list implementation,
the edges that come from a vertex are implemented as a
collection of references to the respective vertices the edges
connect. An incidence list combines the object-based
approach and the adjacency list approach, in which each
vertex object includes a linked list of edge objects pointing
to vertices.

An incidence list representation is depicted in FIG. 5 for the
DAG in the working example. Vertex object 500 contains
data for vertex 200 and may include the following fields: an
optional NODE field to hold an identifier of the vertex (200),
a COL field for the column combination represented by the
node, a linked list EDGES of edge objects 501, 502, 503,
504, and 505, and a PARENT field to indicate the parent node
in a spanning tree for the graph. As described in more detail
hereinbelow, a MARK field is used to keep track of whether
the node has been "visited" within a pass. Each edge object
501-505 contains a reference to another vertex and a link to
the next edge in the list. For example, the reference in edge
object 501 is "210" indicating vertex 210 of the graph. In
FIG. 5, the reference is a value or "cursor" of the identifier
for the associated vertex; however, other implementations
may employ a pointer to the associated vertex object, such
as a virtual address of the start of the associated vertex
object.

Referring again to FIG. 5, vertex object 510 represents 
vertex 210 of the graph (NODE field) and has a linked list 
of edge objects containing edge objects 514 and 515, which 
refer to vertex objects 540 and 550, respectively. Vertex 
object 520 represents vertex 220 of the graph (NODE field) 
and has a linked list of edge objects containing edge objects 
523, 524 and 525, which refer to vertex objects 530, 540 and 
550, respectively. Vertex object 530 represents vertex 230 of 
the graph (NODE field) and has a linked list of edge objects 
containing edge objects 534 and 535, which refer to vertex 
objects 540 and 550, respectively. Vertex object 540 repre- 
sents vertex 240 of the graph (NODE field) and has a linked
list of edge objects containing edge object 545, which refers
to vertex object 550. Finally, vertex object 550 represents
vertex 250 of the graph (NODE field) and has a null linked 
list of edge objects. 
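The vertex objects of FIG. 5 might be modeled as follows. This Python sketch is an assumption-laden illustration: the field names mirror the NODE, COL, EDGES, PARENT, and MARK fields named in the text, a plain Python list stands in for the linked list of edge objects, and edge references are stored as vertex identifiers ("cursors") rather than pointers.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Vertex:
    node: int                                   # NODE: identifier of the vertex
    col: frozenset                              # COL: column combination
    edges: list = field(default_factory=list)   # EDGES: identifiers of destination vertices
    parent: Optional[int] = None                # PARENT: parent in the spanning tree
    mark: bool = False                          # MARK: visited flag for a pass


vertices = {
    200: Vertex(200, frozenset(), [210, 220, 230, 240, 250]),
    210: Vertex(210, frozenset("a"), [240, 250]),
    220: Vertex(220, frozenset("c"), [230, 240, 250]),
    230: Vertex(230, frozenset("bc"), [240, 250]),
    240: Vertex(240, frozenset("abc"), [250]),
    250: Vertex(250, frozenset("abcd"), []),
}

# PARENT fields for the spanning tree of FIG. 3; the root keeps a null parent.
for child, par in {210: 200, 220: 200, 230: 220, 240: 230, 250: 230}.items():
    vertices[child].parent = par

print(vertices[250].parent)  # 230
```

Keeping the spanning tree in the PARENT field lets the tree be changed by reassigning one field per node, without copying the graph.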

Other approaches include an "adjacency matrix" in which 
cells of a square matrix having rows and columns indexed by 
vertices indicate whether the vertices for the row and column
are connected. Another matrix, an "incidence matrix,"
has rows indexed by vertices and columns indexed by edges,
in which each cell in the matrix indicates whether the vertex 
and the edge are incident. Other techniques can be used to 
implement a DAG equivalent to the pattern of column 
combinations referenced by anticipated query types. 
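As a small illustration of the adjacency matrix alternative, the Python sketch below builds the matrix for the working example. The row and column order and the variable names are assumptions made for the illustration.

```python
order = [200, 210, 220, 230, 240, 250]  # row and column order (assumed)
edges = {
    200: [210, 220, 230, 240, 250],
    210: [240, 250],
    220: [230, 240, 250],
    230: [240, 250],
    240: [250],
    250: [],
}

pos = {n: i for i, n in enumerate(order)}
matrix = [[0] * len(order) for _ in order]
for src, dsts in edges.items():
    for dst in dsts:
        matrix[pos[src]][pos[dst]] = 1  # 1 marks a directed edge src -> dst

for row in matrix:
    print(row)
```

Because the edges are directed, the matrix is not symmetric: the cell for row 240, column 250 is 1, while the cell for row 250, column 240 is 0.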

A spanning tree of a graph can be implemented by a 
separate data structure that includes pointers to the vertex 
objects or, preferably, within the same data structure that 
implements the graph and reusing the vertex objects. Since 
a root node in a spanning tree can have a plurality of 
subtrees, the subtrees of a root node can be represented by 
a second linked list of child edge objects. Another approach 




is to include an additional flag in each edge object of the
associated linked list of edge objects, wherein the flag
indicates whether the edge represented by the edge object is
in the spanning tree. Since each node in a spanning tree can
have at most one parent node, yet another approach includes
an extra field in each vertex object to indicate the parent
node in the spanning tree. In the data structure illustrated in
FIG. 5, the PARENT field of vertex object 550 includes a
reference 230 (or an equivalent such as a pointer) to its
parent, vertex 230, in the spanning tree depicted in FIG. 3.
The PARENT field of vertex object 540 indicates vertex 230
as the parent, and the PARENT fields of vertex objects 510,
520, and 530 indicate vertices 200, 200, and 220,
respectively, as the parent. The PARENT field of vertex
object 500 is null, since vertex 200 is the root of the
spanning tree.

Finding a Minimum Leaf Spanning Tree

After the equivalent DAG is constructed, an embodiment
of the present invention finds a minimum leaf spanning tree
of the DAG (step 402). Although a plurality of minimum
leaf spanning trees may exist for a DAG, only one minimum
leaf spanning tree need be found to determine a minimal set
of indexes for a given set of anticipated query types. On the
other hand, it is contemplated that other embodiments of the
present invention are configured to find two or more
minimum leaf spanning trees of a DAG and choose one of
them based on ranking criteria. For example, the net cost for
using indexes indicated by leaves of a minimum leaf spanning
tree can be calculated by computing selectivity factors
for the anticipated queries multiplied by a cost metric for
each index as disclosed in the commonly assigned U.S.
application Ser. No. 08/808,094 entitled "Index Selection for
an Index Access Path" and filed on Feb. 28, 1997 by Hakan
Jakobsson, Michael Depledge, Cetin Ozbutin, and Jeffrey I.
Cohen (now U.S. Pat. No. 5,924,088), incorporated herein
by reference.

One method of finding a minimum leaf spanning tree is
illustrated in FIG. 6. In step 600, an initial spanning
tree is found for the DAG, as by conventional techniques
such as a depth-first search and a breadth-first search, as
described in more detail hereinabove. The present invention
is not limited to any particular initial spanning tree or to any
particular method of finding an initial spanning tree, which
may vary from implementation to implementation. In the
working example, one initial spanning tree is illustrated in
FIG. 3 and has three leaves: node 210, node 240, and node
250.

In step 602, the initial spanning tree is established as the
current spanning tree for a loop that repeatedly finds related
spanning trees with fewer leaf nodes until no more are
found. According to one embodiment of the present
invention, such a related spanning tree is determined by
finding an augmenting path for the current spanning tree in
the DAG, if it exists (steps 604 and 606) and adjusting the
edges of the spanning tree based on the augmenting path to
produce a spanning tree with a reduced number of leaves
(step 608).

Finding an Augmenting Path

In step 604, an augmenting path for the current spanning
tree of the DAG is found, if it exists. One exposition of an
augmenting path in a different context, viz. maximal matchings
in a bipartite graph, is found in Aho, Hopcroft &
Ullman, Data Structures and Algorithms (Reading, Mass.:
Addison-Wesley, 1983). An augmenting path may be formally
defined as follows: given a spanning tree T of a graph
G=(V, E), an augmenting path for T is a sequence $v_0, v_1, \ldots,
v_k$ of vertices from V such that:

(1) $v_0$ is a leaf of T;

(2) for $0 < j < k$, each of $v_j$ has exactly one child in T;

(3) $v_k$ has at least two children in T; and

(4) for $0 \le i < k$, there exists a vertex $u_i$ such that (a) the
edge $v_i \to u_i$ is in E but not in T and (b) the edge $v_{i+1} \to u_i$
is in T.

This formal definition can be visualized with reference to
FIG. 7(a), which depicts a portion of a directed, acyclic
graph comprising nodes $v_0, v_1, v_2, \ldots, v_k$, $u_0, u_1, \ldots, u_{k-1}$,
and w. The edges represented by solid arrows are in the
spanning tree and the graph, and the edges represented by
dashed arrows are in the graph but not in the spanning tree.
In FIG. 7(a), node $v_0$ is a leaf node, condition (1), and nodes
$v_1, v_2, \ldots, v_k$ are non-leaf nodes. Each of nodes $u_0, u_1, \ldots,
u_{k-1}$, and w can be either a leaf node or a non-leaf node.
Considering condition (2), each of nodes $v_1, v_2, \ldots, v_{k-1}$
has only one child in the spanning tree, indicated by a solid
arrow. For condition (3), node $v_k$ has two children, node $u_{k-1}$
and node w. For each of nodes $u_0, u_1, \ldots, u_{k-1}$, the parent
node to the left, marked by a dashed arrow, is not in the
spanning tree, and the parent node to the right is in the
spanning tree, meeting conditions (4a) and (4b) respectively.

One augmenting path for the spanning tree in the working
example of FIG. 3 comprises $v_0$ as node 240 and $v_1$ as node
230, where k=1 and $u_0$ is node 250. Referring to the
definition, $v_0$ (node 240) is a leaf of T, since node 240 does
not have an edge in the spanning tree emanating therefrom.
Condition (2) is trivially satisfied since k=1. The third
condition is met since $v_1$ (node 230) has two children in the
spanning tree: node 240 via edge 234 and node 250 via edge
235. Concerning condition (4), edge $v_0 \to u_0$ (edge 245 from
node 240 to node 250) is not in the spanning tree, but edge
$v_1 \to u_0$ (edge 235 from node 230 to node 250) is in the
spanning tree.

FIGS. 8(a) and 8(b) are flowcharts illustrating one method
for finding an augmenting path for a spanning tree of a DAG,
if it exists. Step 800 controls a loop that iterates through each
leaf node in the current spanning tree until an augmenting
path is found. This iteration can be performed by such
techniques as a [...] order traversal of the spanning tree. If all
the leaf nodes have been exhausted without finding an
augmenting path (see step 806), then the loop controlled by
step 800 terminates and execution passes to step 802, where
the lack of an augmenting path is signaled, as by returning
a "false" boolean value or equivalent.

During the execution of the loop at step 804, each leaf
node under consideration is established as the current node
for finding an augmenting path starting from the current
node. Referring back to the definition of an augmenting
path, the requirement that the start of the augmenting path be
a leaf node satisfies condition (1) that $v_0$ is a leaf of T.
During step 804, the DAG is searched for an augmenting path
that starts at the current node. In one implementation, this
search is performed in a separate subroutine (e.g., a C
function) whose operation is illustrated in FIG. 8(b) starting
at step 810. If the result of searching for an augmenting path
in step 804 is true, then the method indicates that an
augmenting path has been found (step 806). Otherwise,
execution loops back to step 800 where another leaf node, if
available, is considered.

Referring to FIG. 8(b), in step 810, the current node is
checked for an edge in the DAG that is directed from the
current node. Referring back to the working example
depicted in FIG. 3, leaf node 250 does not have such an




edge; consequently, the "NO" branch is taken, indicating that an
augmenting path is not found for the current node (step 812). If
step 810 to find a particular augmenting path was called from
step 804 in the main loop, then returning a "not found" indication
causes another iteration of the loop controlled by step 800 for
another leaf node, if present. With respect to leaf node 240, there
is an edge directed therefrom: edge 245 directed to node 250.
Accordingly, execution proceeds to step 814. Since the current node
is a leaf node, any edge directed therefrom is in the DAG but not in
the spanning tree. Therefore, the test in step 810 checks for
condition (4a) that edge v_i → u_i is in E but not in T. In the
working example, edge v_0 → u_0 is edge 245, which is directed from
node 240 as v_0 to node 250 as u_0 and is not in the current
spanning tree.

At step 814, the destination node of the edge is checked to
determine whether its parent node in the spanning tree has been
visited. In the working example, the parent of destination node 250
in the spanning tree is node 230, since edge 235 is in the spanning
tree. On the other hand, node 200, the root of the DAG and the
spanning tree, does not have a parent and, consequently, does not
meet this condition. Referring again to FIG. 5, it is evident that
the parent of a node can be readily determined according to one
embodiment of the present invention by accessing the PARENT field.
More specifically, the value of the PARENT field in vertex object
550 representing node 250 refers to node 230. By finding a parent
node in the spanning tree for the destination node, condition (4b)
that edge v_{i+1} → u_i is in T is satisfied, since edge 235 from
node 230 as v_1 to node 250 as u_0 is part of the current spanning
tree.

There are at least two advantages for checking whether the parent
node has been visited. One benefit is avoiding infinite loops, and
another benefit is the elimination of superfluous attempts to find
an augmenting path for nodes already determined not to contain an
augmenting path. There is a variety of techniques for determining
whether a node has been visited, but the present invention is not
limited to any particular technique. For example, a separate data
structure can be maintained to record which nodes have been visited.
As another example, the data structure that represents vertices in
the DAG can be augmented to include a MARK field to hold a Boolean
flag that marks whether the corresponding node has been visited. A
drawback of the Boolean flag approach is that the data structure for
the DAG must be traversed each time to reset the flag for each
separate pass of finding an augmenting path. Accordingly, the MARK
field preferably contains a monotonically increasing (or,
alternatively, decreasing) pass number that indicates the last pass
in which the node was visited. Thus, determining whether a node has
been visited is performed by comparing the MARK field of the node to
the current pass number. If the condition in step 814 is not met,
then an indication that an augmenting path is not found for the
current node is made (step 812); otherwise, execution proceeds to
step 816. A node is considered and marked as visited when the
condition in the next step 816 is met.

In step 816, the parent node of the destination node is checked for
the existence of one or more other child nodes. If there is no other
child node, then neither condition (3) that v_k has at least two
children in T nor condition (4a) that edge v_i → u_i is in E but not
in T can be satisfied. Accordingly, execution proceeds to step 812
to indicate that an augmenting path is not present for the current
node. In the working example, however, there is another child node
for parent node 230, namely node 240 via edge 234.

If the edge to any other child node of the parent node is in the
current spanning tree, checked by step 818, then condition (3) that
v_k has at least two children in T is satisfied and an augmenting
path has been found. In the working example, since v_1 as node 230
has at least two children in the current spanning tree, namely node
240 and node 250, an augmenting path has been found comprising v_0
as node 240 and v_1 as node 230, where k=1 and u_0 is node 250.
Accordingly, execution branches to step 820, where the fact that an
augmenting path is found is signaled, as by returning a true Boolean
value or equivalent.

On the other hand, if the edges to the other child nodes of the
parent node are not in the current spanning tree, condition (4a)
that edge v_i → u_i is in E but not in T is satisfied. Condition (2)
that each of v_i has exactly one child in T is also satisfied, since
this parent node has one child in the current spanning tree
(determined in step 814) but no other child in the current spanning
tree (determined in step 818). Consequently, the search for an
augmenting path is continued using one of the other child nodes as
the current node (step 822). One approach is preferably a recursive
call to step 810, as a depth-first search, but other equivalent
approaches, such as a search with an explicit stack or other
supplementary data structure, may be employed. If the result of
searching for a continuation of the augmenting path succeeds, then
execution branches to step 820 where this success is signaled. On
the other hand, if the result of searching for a continuation of the
augmenting path does not succeed, then execution backtracks to step
816 to examine another child node, if it exists, for a potential
continuation of the augmenting path. If no other child node exists,
then execution reaches step 812 indicating that an augmenting path
cannot be continued from the current node.

Reducing a Spanning Tree

Referring back to FIG. 6, execution proceeds to step 606, where the
existence of an augmenting path possibly found in step 604 is
tested. Preferably, step 604 is coded as a routine configured to
perform the steps illustrated in FIG. 8 and return a value, e.g., a
Boolean, indicating whether an augmenting path was found by the
routine. If an augmenting path was found, then execution branches to
step 608, wherein the edges of the current spanning tree are
adjusted based on the augmenting path to produce a new spanning tree
having a fewer number of leaves. On the other hand, if no augmenting
path exists, then the current spanning tree is established as a
minimum leaf spanning tree (step 610) and execution returns back to
step 404.

Given a spanning tree of a graph and an augmenting path for the
spanning tree, a new spanning tree can be constructed based thereon
having a fewer number of leaves. Referring back to FIG. 7(a), such a
reduced leaf spanning tree is constructed by deleting all edges
v_{i+1} → u_i in the augmenting path and replacing them with edges
v_i → u_i. Thus, the parent of u_i in the spanning tree changes from
v_{i+1} to v_i. A result of this procedure is depicted in FIG. 1(b).
By inspection, the new spanning tree has one fewer leaf than the
original spanning tree, because v_0 changes from a leaf to a
non-leaf, v_1, v_2, ..., v_k remain non-leaf nodes, and the
leaf-ness of nodes u_0, u_1, ..., u_{k-1}, and w is unaffected.

In the working example, one augmenting path was found comprising v_0
as leaf node 240 and v_1 as node 230, where k=1 and u_0 is node 250.
Edge 245 from node 240 (v_0) to node 250 (u_0), marked as a dashed
arrow, is not in the spanning tree, and edge 235 from node 230 (v_1)
to node 250 (u_0), marked as a solid arrow, is in the spanning tree.
Accordingly, a reduced leaf spanning tree is constructed by removing
edge 235 from node 230 (v_1) to node 250 (u_0) from the spanning
tree and adding edge 245 from node 240




(v_0) to node 250 (u_0) into the spanning tree. This reduced leaf
spanning tree is illustrated in FIG. 9 and has only two leaf nodes,
nodes 210 and 250, whereas the spanning tree depicted in FIG. 3 has
three leaf nodes. Edge 235 from node 230 (v_1) to node 250 (u_0) is
marked with a dashed arrow and is not in the spanning tree, and edge
245 from node 240 (v_0) to node 250 (u_0) is marked with a solid
arrow and is in the spanning tree. The reduction of the current
spanning tree based on the augmenting path can be performed by a
separate subroutine or integrated with the routine that found the
augmenting path. In the latter case, the edges can be flipped in or
out of the current spanning tree at step 812 by resetting the PARENT
field of the vertex object representing node u_i to reference node
v_i.

With reference to FIG. 6, after adjusting the edges of a current
spanning tree to produce a reduced leaf spanning tree in step 608,
the reduced leaf spanning tree is established as the new current
spanning tree and steps 604, 606, and 608 are repeated until an
augmenting path can no longer be found. In this situation, the loop
terminates and the current spanning tree is returned to step 404 as
the minimum leaf spanning tree.

Building the Indexes from the Minimum Leaf Spanning Tree

As described herein above, the leaves of a spanning tree of a graph
representing prefix relationships between combinations of columns
correspond to a set of indexes that can handle the anticipated
queries. Referring back to FIG. 4, in step 404 the indexes to be
built are determined from the minimum leaf spanning tree generated
within step 402. Since a minimum leaf spanning tree has a minimum
number of leaves and since for any set of n indexes there is a
spanning tree having at most n leaves, determining the set of
indexes to be built from the minimum leaf spanning tree results in a
minimum number of indexes being built, thereby reducing the costs of
maintaining indexes while still efficiently handling the anticipated
query types. In the working example, the minimum leaf spanning tree
depicted in FIG. 9 has two leaves: node 210 representing column
combination {a} and node 250 representing column combination
{a, b, c, d}. Accordingly, the minimum leaf spanning tree indicates
that two indexes may be built upon the corresponding column
combinations in order to support all the exemplary anticipated query
types. Since node 210 representing column combination {a} is a leaf
node in the spanning tree, the minimum leaf spanning tree indicates
that an index may be built upon column a.

Since the order of columns in a multi-column index is significant,
the column combinations that form prefixes of other column
combinations must occur before the non-prefixed columns.
Specifically, the column combinations are built in reverse order of
the column combinations represented by nodes on the path from a leaf
node to the root. According to one embodiment of the present
invention, this ordering may be determined by traversing via the
PARENT field the minimum leaf spanning tree from a leaf node to the
root while building a stack of column combinations. Due to the LIFO
(last in, first out) nature of a stack, pulling column combinations
from the stack results in a proper, reverse order of column
combinations. The stack need not be explicit, as described herein,
because a series of recursive function calls achieve a similar
result by an implicit use of a call stack. Other approaches such as
lists and queues may also be adopted.

In the working example, starting from multi-column leaf node 250,
column combination {a, b, c, d} is pushed on the stack, and parent
node 240 is visited. At node 240, column combination {a, b, c} is
pushed onto the stack so that the stack contains the following
elements: ({a, b, c}, {a, b, c, d}). Subsequently, at next parent
node 230, column combination {b, c} is pushed onto the stack so that
the stack contains the following elements: ({b, c}, {a, b, c},
{a, b, c, d}). Subsequently, at next parent node 220, column
combination {c} is pushed onto the stack so that the stack contains
the following elements: ({c}, {b, c}, {a, b, c}, {a, b, c, d}).
Since the parent of node 220 is the root node, the order of columns
can be determined by pulling off the top elements of the stack.
First, column combination {c} is pulled off the stack; thus the
first column in the multi-column index is column c. The next column
combination to be pulled off the stack is column combination {b, c}
and the new column b is placed after column c, resulting in the
order: c and b. Next, column combination {a, b, c} is pulled off the
stack and new column a is added to the order of columns, resulting
in the order: c, b, and a. Finally, column combination {a, b, c, d}
is pulled off the stack and new column d is added to the order of
columns, resulting in c, b, a, and d.

Therefore, the two indexes of the working example, a first index
built on column a, and a second index built on columns c, b, a, and
d, can cover the exemplary anticipated query types referencing
column combinations {a}, {c}, {b, c}, {a, b, c}, and {a, b, c, d}.
More specifically, a query referencing column combination {a} can
use the first index built on column a. A query referencing column
combination {c} can use the multi-column index built upon columns c,
b, a, and d, because column c is first. A query referencing columns
b and c can use the multi-column index built on columns c, b, a, and
d because columns c and b are first. Similarly, the query
referencing column combination {a, b, c} and the query referencing
column combination {a, b, c, d} can use the multi-column index built
on columns c, b, a, and d.

Subquery Snapshots

The present application may be applied to improve the efficiency of
fast refresh of snapshots defined by a query containing a subquery.
A snapshot is a body of data constructed of data from a "master"
table. The data contained within a snapshot is defined by a query
that references the master table and optionally other tables, views,
or snapshots. A snapshot can be refreshed periodically or on demand
by a user to reflect the current state of its corresponding base
tables.

One method of refreshing a table is called "fast refresh," which
transfers to the snapshot only those changes to the master table
that have been made since the last refresh of the snapshot. A
"master log" file can be employed to track and record the rows that
have been updated in the master table. When a snapshot is refreshed,
only the appropriate rows in the master log need to be applied to
the snapshot table. In a networked environment, only those modified
rows found at the master site are transferred across the network and
updated or inserted into the snapshot. Rows deleted in the master
table are also deleted in the snapshot. Fast refresh is typically
faster, more efficient, and involves less network traffic than
another form of refresh, called "complete refresh," in which the
snapshot definition query is merely reissued.

As described in more detail in the commonly assigned U.S.
application Ser. No. 08/880,928, entitled "Fast Refresh of Snapshots
Containing Subqueries," filed on Jun. 23, 1997 by Alan Downing,
Harry Sun, and Ashish Gupta, the contents of which are incorporated
herein by reference, a fast refresh can be performed on snapshots
defined by a query that includes a subquery. For example, a subquery
snapshot on table T1 1000 may be defined by the following snapshot
definition query:

Query 4

select * from T1 where exists (select a from T2 where T1.a=T2.a)
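
As a rough illustration of what Query 4 selects, the following Python sketch models tables T1 and T2 as lists of dictionaries and builds the snapshot from the rows of T1 whose value of column a also occurs in T2. It also shows how an index on column a could locate the snapshot rows affected when a row is deleted from T2. The function names, the dictionary-based index, and the sample rows are assumptions made for illustration only; they are not taken from the referenced application.

```python
from collections import defaultdict

def build_snapshot(t1_rows, t2_rows):
    """Rows of T1 whose value of column a also occurs in column a of T2."""
    t2_values = {row["a"] for row in t2_rows}
    return [row for row in t1_rows if row["a"] in t2_values]

def index_on_a(rows):
    """Hypothetical index on column a: value -> positions of matching rows."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row["a"]].append(pos)
    return index

def refresh_after_t2_delete(snapshot, t2_rows, deleted):
    """Return the snapshot after the row `deleted` is removed from T2.

    Snapshot rows are dropped only if no remaining T2 row has the same
    value of column a. A real system would maintain the index
    incrementally; it is rebuilt here for brevity.
    """
    remaining = [row for row in t2_rows if row is not deleted]
    if any(row["a"] == deleted["a"] for row in remaining):
        return snapshot
    doomed = set(index_on_a(snapshot).get(deleted["a"], []))
    return [row for pos, row in enumerate(snapshot) if pos not in doomed]

# Sample data (made up for this sketch).
t1 = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}, {"a": 3, "b": "z"}]
t2 = [{"a": 1}, {"a": 2}, {"a": 2}]
snapshot = build_snapshot(t1, t2)
```

In this sketch, deleting one of the two T2 rows with a=2 leaves the snapshot unchanged, because another T2 row still supports the matching T1 row.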

The nested select statement "(select a from T2 where
T1.a=T2.a)" is a subquery, where table T1 1000 is a master
table and table T2 is another base table. This snapshot
contains all the rows of table T1 1000 in which the value of
column a is also found in column a of table T2. If a row is
deleted in table T2, then rows in the snapshot that depend on
that row are deleted from the snapshot during a fast refresh.
For this purpose, it is advantageous to have an index built on
column a to efficiently drive the delete operation. A snapshot
definition query can be more complex; for example, the
following snapshot definition query has five different
subqueries:

Query 5

select * from T1
where exists (select a from T2 where T1.a=T2.a)
and exists (select c from T3 where T1.c=T3.c)
and exists (select b, c from T4 where T1.b=T4.b and T1.c=T4.c)
and exists (select a, b, c from T5 where T1.a=T5.a and
T1.b=T5.b and T1.c=T5.c)
and exists (select a, b, c, d from T6 where T1.a=T6.a and
T1.b=T6.b and T1.c=T6.c and T1.d=T6.d)

For Query 5, five column combinations are anticipated
to be frequently referenced, namely column combinations
{a}, {c}, {b, c}, {a, b, c}, and {a, b, c, d}. Although building
five indexes for the respective column combinations enables
efficient operation of the fast refresh, the number of indexes
that are built is excessive, because the five indexes have to
be updated each time a row is deleted from the snapshot.
Accordingly, it is desirable to share multi-column indexes for
the column combinations if possible, thereby avoiding
unnecessary index maintenance costs.
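
The idea of sharing one multi-column index among several column combinations can be sketched as a simple prefix test. The sketch below assumes, as a simplification, that an index can serve a column combination exactly when the columns of the combination are the leading columns of the index, in any order among themselves. The function name `covers` and the variable names are hypothetical.

```python
def covers(index_columns, combination):
    """True if `combination` equals the set of leading columns of the index."""
    k = len(combination)
    return k <= len(index_columns) and set(index_columns[:k]) == set(combination)

# The five column combinations referenced by Query 5.
combinations = [{"a"}, {"c"}, {"b", "c"}, {"a", "b", "c"}, {"a", "b", "c", "d"}]

# One shared column order for a single multi-column index.
shared = ("c", "b", "a", "d")

# Combinations that the shared index alone cannot serve.
uncovered = [c for c in combinations if not covers(shared, c)]
```

Under this assumption the single index on c, b, a, d serves four of the five combinations, and only {a} needs an index of its own.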

Use of the present invention to determine the minimum
number of indexes to be built that can cover the anticipated
column combinations in the subqueries of a subquery snap- 
shot advantageously reduces index maintenance costs by 
eliminating the unnecessary indexes. As described herein 
above with respect to the working example, one minimal set 
of indexes for the family of column combinations {a}, {c}, 
{b, c}, {a, b, c}, and {a, b, c, d} includes an index built on 
column a and a multi-column index built upon columns c, b, 
a, and d. Consequently, only two indexes need be 
maintained, not five indexes according to one conventional 
approach nor even three indexes according to a use of a 
depth-first search to find an initial spanning tree on an 
equivalent directed, acyclic graph. 
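
The overall procedure (build the graph, find an initial spanning tree by depth-first search, repeatedly look for an augmenting path and re-parent nodes along it, then read the column order of each index off the path from a leaf to the root) can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the patented implementation: all function names are hypothetical, a Python set of visited parents stands in for the MARK field, edges run from every proper subset to its supersets, and columns that become new at the same step are ordered alphabetically.

```python
def build_dag(combinations):
    """Nodes are column sets; edges run from each proper subset to its supersets.
    The empty set acts as the root."""
    nodes = list(dict.fromkeys([frozenset()] + [frozenset(c) for c in combinations]))
    return nodes[0], {v: [u for u in nodes if v < u] for v in nodes}

def dfs_tree(root, succ):
    """Initial spanning tree as a PARENT map, found by depth-first search."""
    parent = {root: None}
    def visit(v):
        for u in succ[v]:
            if u not in parent:
                parent[u] = v
                visit(u)
    visit(root)
    return parent

def min_leaf_spanning_tree(root, succ):
    parent = dfs_tree(root, succ)
    kids = {v: 0 for v in succ}          # number of tree children per node
    for p in parent.values():
        if p is not None:
            kids[p] += 1

    def augment(v, visited):
        """Try to give v a tree child by re-parenting along an augmenting path."""
        for u in succ[v]:                # edge v -> u is in the DAG
            p = parent[u]                # edge p -> u is in the tree
            if p == v or p in visited:
                continue
            visited.add(p)
            if kids[p] >= 2 or augment(p, visited):
                kids[p] -= 1             # u leaves its old parent p
                parent[u] = v            # and becomes a child of v
                kids[v] += 1
                return True
        return False

    # Repeat until no leaf starts an augmenting path.
    while any(kids[v] == 0 and augment(v, set()) for v in succ):
        pass
    return parent

def column_order(leaf, parent):
    """Walk PARENT links to the root, then pop the stack to order the columns."""
    stack, node = [], leaf
    while parent[node] is not None:
        stack.append(node)
        node = parent[node]
    order = []
    while stack:
        order.extend(sorted(stack.pop() - set(order)))
    return order

def choose_indexes(combinations):
    """One column order per leaf of the reduced spanning tree."""
    root, succ = build_dag(combinations)
    parent = min_leaf_spanning_tree(root, succ)
    internal = set(parent.values())
    return [column_order(v, parent)
            for v in succ if v not in internal and v != root]
```

Calling `choose_indexes` on the five column combinations of the working example yields two column orders. Which two it yields depends on the order in which nodes are visited, so the resulting tree need not be the one shown in FIG. 9, although it has the same number of leaves.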

In the preceding description, the term "column" has been 
used to refer to columns of relational database tables. 
However, the term more generally applies to fields into 
which records from a body of data are organized. For 
example, in object oriented environments, attributes of 
object classes act as columns in that they divide object data 
from objects that belong to the classes into fields. Thus, the 
present invention is not limited to use with relational tables. 

In the foregoing specification, the invention has been
described with reference to specific embodiments thereof. It
will be apparent, however, that various modifications and
changes may be made thereto without departing from the
broader spirit and scope of the invention. The specification
and drawings are, accordingly, to be regarded in an
illustrative rather than a restrictive sense.
What is claimed is: 

1. A method of creating one or more indexes for a body 
of data arranged in columns, said indexes used to support a 
plurality of query types, said query types referencing respec- 
tive combinations of one or more of said columns, said 
method comprising the computer-implemented steps of: 

building a graph based on the plurality of combinations of 

one or more of said columns of said body of data; 
finding a minimum leaf spanning tree for the graph; and 
creating said one or more indexes based on the minimum 
leaf spanning tree. 

2. The method of claim 1, wherein: 

the step of building a graph includes the step of building 
the graph having a plurality of nodes corresponding 
respectively to the plurality of combinations of one or 
more of said columns; and 

the step of creating said one or more indexes includes the 
steps of: 

selecting one or more combinations of one or more 
columns, said combinations of one or more columns 
corresponding to leaf nodes of the minimum leaf 
spanning tree; and 

creating said one or more indexes based on the one or 
more selected combinations of one or more columns, 
respectively. 

3. The method of claim 2, wherein the step of building the 
graph having a plurality of nodes corresponding respectively 
to the plurality of combinations of one or more of said 
columns includes the step of adding an edge directed from 
a first node to a second node, wherein: 

the first node corresponds to a first combination of one or 

more of said columns; 
the second node corresponds to a second combination of 

columns; and 

the first combination of columns is a subset of the second 
combination of columns. 

4. The method of claim 2, wherein the step of finding a 
minimum leaf spanning tree for the graph includes the 
computer-implemented steps of: 

(a) finding an initial spanning tree for the graph; 

(b) establishing the initial spanning tree as a current 
spanning tree; 

(c) determining whether an augmenting path exists for the 
graph and the current spanning tree; 

(d) if the augmenting path exists, then determining a new 
spanning tree, having fewer leaves than the current 
spanning tree, based on the augmenting path, the cur- 
rent spanning tree, and the graph and establishing the 
new spanning tree as the current spanning tree; 

(e) repeating steps (c) and (d) until no augmenting path 
exists for the graph and the current spanning tree; and 

(f) establishing the current spanning tree as the minimum 
leaf spanning tree. 

5. The method of claim 4, wherein the step of finding an 
initial spanning tree for the graph includes the step of finding 
the initial spanning tree by a depth-first search. 

6. The method of claim 4, wherein the step of determining 
whether an augmenting path exists for the graph and the 
current spanning tree includes the steps of: 

(1) establishing a leaf node in the current spanning tree as 
a current leaf node; 




(2) determining whether an augmenting path starting from 
the current leaf node exists for the graph and the current 
spanning tree; and 

(3) repeating steps (1) and (2) until (3a) the augmenting
path starting from the current leaf node exists or (3b) all
leaf nodes in the current spanning tree have been
considered.

7. The method of claim 6, wherein the step of determining
whether an augmenting path starting from the current leaf
node exists for the graph and the current spanning tree
includes the steps of:

(i) establishing the current leaf node as a current node;

(ii) determining whether there exist a first edge in the
graph but not in the current spanning tree directed from
the current node to a first node and a second edge in the
current spanning tree directed from a second node to
the first node, wherein the second node is not already
part of the augmenting path starting from the current
leaf node;

(iii) if neither said first edge nor said second edge exists
for the current node, then establishing that the
augmenting path starting from the current leaf node does
not exist;

(iv) if both said first edge and said second edge exist for
the current node, then determining whether there exists
a third edge in the current spanning tree directed from
the second node to a third node different from the first
node;

(v) if both said first edge and said second edge exist for
the current node and the third edge exists, then
establishing that the augmenting path starting from the
current leaf node including the first node, the second
node, and the third node exists;

(vi) if both said first edge and said second edge exist for
the current node and the third edge does not exist, then
determining whether a continuation of the augmenting
path starting from the second node exists for the graph
and the current spanning tree; and

(vii) if the continuation of the augmenting path exists,
then establishing that the augmenting path starting from
the current leaf node including the first node, the
second node, and the continuation of the augmenting
path exists.

8. The method of claim 2, wherein the step of building the
graph having a plurality of nodes corresponding respectively
to the plurality of combinations of one or more of said
columns includes the step of building the graph with a root
node corresponding to an empty combination and a plurality
of edges directed from the root node to the plurality of
nodes, respectively.

9. The method of claim 1, wherein the step of building a
graph based on the plurality of the combinations of one or
more of said columns of said body of data includes the step
of building the graph based on a snapshot definition query.

10. A method of finding a minimum leaf spanning tree for 
a directed acyclic graph (DAG), said method comprising the 
computer-implemented steps of: 

(a) finding an initial spanning tree for the DAG;

(b) establishing the initial spanning tree as a current 
spanning tree; 

(c) determining whether an augmenting path exists for the 
DAG and the current spanning tree; 

(d) if the augmenting path exists, then determining a new
spanning tree, having fewer leaves than the current
spanning tree, based on the augmenting path, the current
spanning tree, and the graph and establishing the
new spanning tree as the current spanning tree;

(e) repeating steps (c) and (d) until no augmenting path 
exists for the DAG and the current spanning tree; and 

(f) establishing the current spanning tree as the minimum 
leaf spanning tree. 

11. The method of claim 10, wherein the step of finding 
an initial spanning tree for the DAG includes the step of 
finding the initial spanning tree by a depth-first search. 

12. The method of claim 10, wherein the step of deter- 
mining whether an augmenting path exists for the DAG and 
the current spanning tree includes the steps of: 

(1) establishing a leaf node in the current spanning tree as 
a current leaf node; 

(2) determining whether an augmenting path starting from 
the current leaf node exists for the DAG and the current 
spanning tree; and 

(3) repeating steps (1) and (2) until (3a) the augmenting 
path starting from the current leaf node exists or (3b) all 
leaf nodes in the current spanning tree have been 
considered. 

13. The method of claim 12, wherein the step of deter- 
mining whether an augmenting path starting from the cur- 
rent leaf node exists for the DAG and the current spanning 
tree includes the steps of: 

(i) establishing the current leaf node as a current node; 

(ii) determining whether there exist a first edge in the 
DAG but not in the current spanning tree directed from 
the current node to a first node and a second edge in the 
current spanning tree directed from a second node to 
the first node, wherein the second node is not already 
part of the augmenting path starting from the current 
leaf node; 

(iii) if neither said first edge nor said second edge exists 
for the current node, then establishing that the aug- 
menting path starting from the current leaf node does 
not exist; 

(iv) if both said first edge and said second edge exist for 
the current node, then determining whether there exists 
a third edge in the current spanning tree directed from 
the second node to a third node different from the first 
node; 

(v) if both said first edge and said second edge exist for 
the current node and the third edge exists, then estab- 
lishing that the augmenting path starting from the 
current leaf node including the first node, the second 
node, and the third node exists; 

(vi) if both said first edge and said second edge exist for 
the current node and the third edge does not exist, then 
determining whether a continuation of the augmenting 
path starting from the second node exists for the DAG 
and the current spanning tree; and 

(vii) if the continuation of the augmenting path exists, 
then establishing that the augmenting path starting from 
the current leaf node including the first node, the 
second node, and the continuation of the augmenting 
path exists. 

14. A computer-readable medium bearing instructions for 
creating one or more indexes for a body of data arranged in 
columns, said indexes used to support a plurality of query 
types, said query types referencing respective combinations 
of one or more of said columns, said instructions arranged 
to cause one or more processors to perform the steps of: 

building a graph based on the plurality of combinations of 
one or more of said columns of said body of data; 




finding a minimum leaf spanning tree for the graph; and 
creating said one or more indexes based on the minimum 
leaf spanning tree. 

15. The computer-readable medium of claim 14, wherein:

the step of building a graph includes the step of building
the graph having a plurality of nodes corresponding
respectively to the plurality of combinations of one or
more of said columns; and

the step of creating said one or more indexes includes the
steps of:

selecting one or more combinations of one or more
columns, said combinations of one or more columns
corresponding to leaf nodes of the minimum leaf
spanning tree; and

creating said one or more indexes based on the one or
more selected combinations of one or more columns,
respectively.

16. The computer-readable medium of claim 15, wherein
the step of building the graph having a plurality of nodes
corresponding respectively to the plurality of combinations
of one or more of said columns includes the step of adding
an edge directed from a first node to a second node, wherein:

the first node corresponds to a first combination of one or
more of said columns;

the second node corresponds to a second combination of
columns; and

the first combination of columns is a subset of the second
combination of columns.

17. The computer-readable medium of claim 15, wherein
the step of finding a minimum leaf spanning tree for the
graph includes the computer-implemented steps of:

(a) finding an initial spanning tree for the graph;

(b) establishing the initial spanning tree as a current
spanning tree;

(c) determining whether an augmenting path exists for the
graph and the current spanning tree;

(d) if the augmenting path exists, then determining a new
spanning tree, having fewer leaves than the current
spanning tree, based on the augmenting path, the current
spanning tree, and the graph and establishing the
new spanning tree as the current spanning tree;

(e) repeating steps (c) and (d) until no augmenting path
exists for the graph and the current spanning tree; and

(f) establishing the current spanning tree as the minimum
leaf spanning tree.

18. The computer-readable medium of claim 17, wherein
the step of finding an initial spanning tree for the graph
includes the step of finding the initial spanning tree by a
depth-first search.

19. The computer-readable medium of claim 17, wherein
the step of determining whether an augmenting path exists
for the graph and the current spanning tree includes the steps
of:

(1) establishing a leaf node in the current spanning tree as
a current leaf node;

(2) determining whether an augmenting path starting from
the current leaf node exists for the graph and the current
spanning tree; and

(3) repeating steps (1) and (2) until (3a) the augmenting
path starting from the current leaf node exists or (3b) all
leaf nodes in the current spanning tree have been
considered.

20. The computer-readable medium of claim 19, wherein
the step of determining whether an augmenting path starting 




from the current leaf node exists for the graph and the 
current spanning tree includes the steps of: 

(i) establishing the current leaf node as a current node; 

(ii) determining whether there exist a first edge in the 
graph but not in the current spanning tree directed from 
the current node to a first node and a second edge in the 
current spanning tree directed from a second node to 
the first node, wherein the second node is not already 
part of the augmenting path starting from the current 
leaf node; 

(iii) if neither said first edge nor said second edge exists 
for the current node, then establishing that the aug- 
menting path starting from the current leaf node does 
not exist; 

(iv) if both said first edge and said second edge exist for 
the current node, then determining whether there exists 
a third edge in the current spanning tree directed from 
the second node to a third node different from the first 
node; 

(v) if both said first edge and said second edge exist for 
the current node and the third edge exists, then estab- 
lishing that the augmenting path starting from the 
current leaf node including the first node, the second 
node, and the third node exists; 

(vi) if both said first edge and said second edge exist for 
the current node and the third edge does not exist, then 
determining whether a continuation of the augmenting 
path starting from the second node exists for the graph 
and the current spanning tree; and 

(vii) if the continuation of the augmenting path exists, 
then establishing that the augmenting path starting from 
the current leaf node including the first node, the 
second node, and the continuation of the augmenting 
path exists. 

21. The computer-readable medium of claim 15, wherein 
the step of building the graph having a plurality of nodes 
corresponding respectively to the plurality of combinations 
of one or more of said columns includes the step of building 
the graph with a root node corresponding to an empty 
combination and a plurality of edges directed from the root 
node to the plurality of nodes, respectively. 

22. The computer-readable medium of claim 14, wherein 
the step of building a graph based on the plurality of the 
combinations of one or more of said columns of said body 
of data includes the step of building the graph based on a 
snapshot definition query. 

23. A computer-readable medium bearing instructions for 
finding a minimum leaf spanning tree for a directed acyclic 
graph (DAG), said instructions arranged to cause one or 
more processors to perform the steps of: 

(a) finding an initial spanning tree for the DAG; 

(b) establishing the initial spanning tree as a current 
spanning tree; 

(c) determining whether an augmenting path exists for the 
DAG and the current spanning tree; 

(d) if the augmenting path exists, then determining a new
spanning tree, having fewer leaves than the current
spanning tree, based on the augmenting path, the current
spanning tree, and the graph and establishing the
new spanning tree as the current spanning tree;

(e) repeating steps (c) and (d) until no augmenting path 
exists for the DAG and the current spanning tree; and 

(f) establishing the current spanning tree as the minimum 
leaf spanning tree. 

24. The computer-readable medium of claim 23, wherein
the step of finding an initial spanning tree for the DAG
includes the step of finding the initial spanning tree by a
depth-first search.

25. The computer-readable medium of claim 23, wherein
the step of determining whether an augmenting path exists
for the DAG and the current spanning tree includes the steps
of:

(1) establishing a leaf node in the current spanning tree as
a current leaf node;

(2) determining whether an augmenting path starting from
the current leaf node exists for the DAG and the current
spanning tree; and

(3) repeating steps (1) and (2) until (3a) the augmenting
path starting from the current leaf node exists or (3b) all
leaf nodes in the current spanning tree have been
considered.

26. The computer-readable medium of claim 25, wherein
the step of determining whether an augmenting path starting
from the current leaf node exists for the DAG and the current
spanning tree includes the steps of:

(i) establishing the current leaf node as a current node; 

(ii) determining whether there exist a first edge in the
DAG but not in the current spanning tree directed from
the current node to a first node and a second edge in the
current spanning tree directed from a second node to
the first node, wherein the second node is not already
part of the augmenting path starting from the current
leaf node;

(iii) if neither said first edge nor said second edge exists
for the current node, then establishing that the augmenting
path starting from the current leaf node does
not exist;

(iv) if both said first edge and said second edge exist for 
the current node, then determining whether there exists 
a third edge in the current spanning tree directed from 
the second node to a third node different from the first 
node; 

(v) if both said first edge and said second edge exist for
the current node and the third edge exists, then establishing
that the augmenting path starting from the
current leaf node including the first node, the second
node, and the third node exists;

(vi) if both said first edge and said second edge exist for 
the current node and the third edge does not exist, then 
determining whether a continuation of the augmenting 
path starting from the second node exists for the DAG 
and the current spanning tree; and 

(vii) if the continuation of the augmenting path exists, 
then establishing that the augmenting path starting from 
the current leaf node including the first node, the 
second node, and the continuation of the augmenting 
path exists. 
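
The loop recited in claims 23 to 26 can be illustrated with a short Python sketch. It is only one reading of the claim language, not the patentee's implementation. The DAG is assumed to be a dict of adjacency lists with a single source node, the spanning tree is assumed to be stored as a child-to-parent map, and the names `dfs_tree`, `leaves`, `find_path` and `minimize_leaves` are hypothetical.

```python
from collections import Counter


def dfs_tree(graph, root):
    """Steps (a)-(b): depth-first spanning tree as a {node: parent} map."""
    parent = {root: None}

    def visit(u):
        for v in graph.get(u, ()):
            if v not in parent:
                parent[v] = u
                visit(v)

    visit(root)
    return parent


def leaves(parent):
    """Nodes that are nobody's parent in the current tree."""
    has_child = set(parent.values())
    return [v for v in parent if v not in has_child]


def find_path(graph, parent, kids, current, on_path):
    """Steps (i)-(vii): list of (new_parent, node) re-parentings, or None."""
    for first in graph.get(current, ()):
        second = parent[first]            # tree edge second -> first
        if second in on_path:             # also skips tree edges out of current
            continue
        if kids[second] >= 2:             # a third edge second -> third exists
            return [(current, first)]
        rest = find_path(graph, parent, kids, second, on_path | {second})
        if rest is not None:              # continuation found from second
            return [(current, first)] + rest
    return None


def minimize_leaves(graph, root):
    """Steps (c)-(f): augment until no leaf starts an augmenting path."""
    parent = dfs_tree(graph, root)
    while True:
        kids = Counter(p for p in parent.values() if p is not None)
        path = None
        for leaf in leaves(parent):
            path = find_path(graph, parent, kids, leaf, frozenset([leaf]))
            if path:
                break
        if not path:
            return parent
        for new_parent, node in path:     # each step re-parents one node
            parent[node] = new_parent


# Hypothetical example DAG: the depth-first tree has three leaves.
graph = {"r": ["v2", "p1", "u0"], "p1": ["v1", "v2"], "u0": ["v1"]}
tree = minimize_leaves(graph, "r")
print(sorted(leaves(tree)))
```

In this example the depth-first tree has leaves `u0`, `v1` and `v2`. One augmenting path starting at `u0` re-parents `v1` under `u0` and `v2` under `p1`, leaving two leaves. Because the input is assumed to be a DAG with a single source, any assignment of one parent per non-root node is still a spanning tree, which is why the sketch can apply the path by overwriting parent entries.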






United States Patent [19]
Rathbun

US006138123A
[11] Patent Number: 6,138,123
[45] Date of Patent: *Oct. 24, 2000



[54] METHOD FOR CREATING AND USING
PARALLEL DATA STRUCTURES

[76] Inventor: Kyle R. Rathbun, 2357 Stonehedge
Dr., Apt. E, East Lansing, Mich. 48823

[ * ] Notice: This patent issued on a continued prosecution
application filed under 37 CFR 1.53(d), and is subject to
the twenty year patent term provisions of 35 U.S.C.
154(a)(2).

[21] Appl. No.: 08/892,705
[22] Filed: Jul. 15, 1997




Related U.S. Application Data

[60] Provisional application No. 60/023,340, Jul. 25, 1996, and
provisional application No. 60/022,616, Jul. 26, 1996.

[51] Int. Cl.7 G06F 17/30
[52] U.S. Cl. 707/201; 707/102; 345/339; 345/800
[58] Field of Search 707/102, 201,
707/5, 104, 2, 3, 4, 101, 531, 7, 8, 10;
395/800, 200; 364/231, 490, 468; 455/456;
370/381, 389, 256, 406; 711/129, 153,
173, 206; 345/339, 349, 440, 800; 358/1,
18

[56] References Cited 

U.S. PATENT DOCUMENTS 



5,230,047    7/1993   Frey, Jr.         395/182
5,319,778    6/1994   Catino            707/102
5,430,869    7/1995   Ishak et al.      707/101
5,475,837   12/1995   Ishak et al.      707/101
5,475,851   12/1995   Kodosky et al.    345/339
5,535,408    7/1996   Hillis            345/800
5,539,922    7/1996   Wang              455/456
5,551,027    8/1996   Choy et al.       707/201
5,602,754    2/1997   Beatty et al.     364/489
5,608,903    3/1997   Prasad et al.     707/10



OTHER PUBLICATIONS 

R.G. Gallager et al., "A Distributed Algorithm for Minimum-Weight
Spanning Trees", Jan. 1983, ACM Transactions on Programming
Languages and Systems, vol. 5, No. 1, pp. 66-77.

Richard Weinberg, "Parallel Processing Image Synthesis
and Anti-Aliasing", Aug. 1981, Computer Graphics, vol.
15, No. 3, pp. 55-62.

Shmuel Zaks, "Optimal Distributed Algorithms for Sorting
and Ranking", Apr. 1985, IEEE Transactions on Computers,
vol. C-34, No. 4, pp. 376-379.

Clyde P. Kruskal, "Searching, Merging, and Sorting in
Parallel Computation", Oct. 1983, IEEE Transactions on
Computers, vol. C-32, No. 10, pp. 942-946.

Carla Schlatter Ellis, "Distributed Data Structures: A Case
Study", May 1985, The 5th International Conference on
Distributed Computing Systems, IEEE Computer Society,
Computer Society Press, pp. 201-208.

Ossama I. El-Dessouki et al., "Distributed Search of Game
Trees", May 1984, The 4th International Conference on
Distributed Computing Systems, IEEE Computer Society,
Computer Society Press, pp. 183-191.

(List continued on next page.) 

Primary Examiner — Wayne Amsbury 

Assistant Examiner — Thu-Thao Havan 

Attorney, Agent, or Firm — Harness, Dickey & Pierce, P.L.C 

[57] ABSTRACT 

Parallel data-structures distribute a given data set to system
components by grouping the data set according to ranges.
These ranges are sub-divided for distribution into parallel
form. A given data value is located by its placement within
an appropriate range; the ranges are located by their
relationships to each other and the data set as a whole; thus, the
ranges are related to each other, the order of the data set is
maintained and access is gained to the data set by range.
Each range may be distributed to multiple nodes; each node
may be contained in a separate data-structure; each separate
data-structure may be maintained on a separate system
component. The result is a method of creating and using
parallel data-structures that may take a wide variety of forms
and be used to control data distribution and the efficient
distribution of system resources.
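
The range-based grouping described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented method. The class name `RangeDistributedSet`, the `split_points` parameter, and the policy of sending a new value to the processor that currently holds the fewest values of that range are all hypothetical.

```python
from bisect import bisect_right, insort
from heapq import merge


class RangeDistributedSet:
    """Values grouped by range; each range is spread over several processors."""

    def __init__(self, split_points, processors):
        # split_points [20, 30] gives ranges (min-19), (20-29), (30-max).
        self.split_points = sorted(split_points)
        self.processors = processors
        # parts[r][p] is the sorted local part of range r held by processor p.
        self.parts = [[[] for _ in range(processors)]
                      for _ in range(len(self.split_points) + 1)]

    def range_of(self, value):
        """Index of the range whose bounds contain value."""
        return bisect_right(self.split_points, value)

    def insert(self, value):
        """Place value in its range, on the processor holding the fewest values."""
        local = self.parts[self.range_of(value)]
        target = min(range(self.processors), key=lambda p: len(local[p]))
        insort(local[target], value)
        return target

    def find(self, value):
        """Return (range index, processor index) holding value, or None."""
        r = self.range_of(value)
        for p, part in enumerate(self.parts[r]):
            if value in part:
                return r, p
        return None

    def in_order(self):
        """All values in ascending order, visiting ranges left to right."""
        return [v for local in self.parts for v in merge(*local)]


s = RangeDistributedSet([20, 30, 40, 50, 60], processors=3)
for v in [20, 24, 25, 28, 60, 65, 70, 15]:
    s.insert(v)
print(s.find(25))      # (1, 2): range (20-29), third processor
print(s.in_order())
```

A lookup first selects the range and only then consults the processors that share it, so the order of the whole data set is preserved by the ranges even though each range is spread across several components.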

36 Claims, 68 Drawing Sheets 



[Front-page drawing not reproduced. Partially legible labels: "Composite Global Node Based on Range", "Implicit Global Link Based on Common".]



6,138,123 

Page 2 



OTHER PUBLICATIONS 

Raphael Finkel and Udi Manber, "DIB — A Distributed 
Implementation of Backtracking", May 1985, The 5th International
Conference on Distributed Computing Systems,
IEEE Computer Society, Computer Society Press, pp.
446-452.

W. Daniel Hillis and Guy L. Steele, Jr., "Data Parallel 
Algorithms", Dec. 1986, Communications of the ACM, vol. 
29, No. 12, pp. 1170-1183. 

Jishnu Mukerji and Richard B. Kieburtz, "A Distributed File 
System for a Hierarchical Multicomputer", Oct. 1979, The 
1st International Conference on Distributed Computing 
Systems, IEEE Computer Society, Catalog No. 79CH1445-6 
C, pp. 448-457.

Keki B. Irani et al., "A Combined Communication Network 
Design and File Allocation for Distributed Databases", Apr. 
1981, The 2nd International Conference on Distributed 
Computing Systems, IEEE Catalog No. 81CH1591-7,
Computer Society Press, pp. 197-210.

Bruce Lindsay, "Object Naming and Catalog Management 
for a Distributed Database Manager", Apr. 1981, The 2nd 
International Conference on Distributed Computing
Systems, IEEE Catalog No. 81CH1591-7, Computer Society
Press, pp. 31-40. 



Ajay K. Gupta et al., "Load Balanced Priority Queues on
Distributed Memory Machines", Western Michigan University
Research, Fellowship from the Faculty Research and
Creative Activities Support Funds, WMU-FRCASF 90-15
and WMU-FRACASF 94-040 and National Science Foundation,
Grant No. USE-90-52346.

Elise de Doncker et al., "Two Methods for Load Balanced 
Distributed Adaptive Integration", Department Computer 
Science, Western Michigan University, National Science 
Foundation, Grant No. CCR-9405377. 

Elise de Doncker et al., "Use of ParInt for Parallel Computation
of Statistics Integrals", Department Computer Science,
Western Michigan University, National Science Foundation,
Grant Nos. CCR-9405377 and DMS-9211640.

Elise de Doncker et al., "Development of a Parallel and 
Distributed Integration Package — Part I", Department Com- 
puter Science, Western Michigan University, National Sci- 
ence Foundation, Grant No. CCR-9405377. 






U.S. Patent Oct. 24, 2000 Sheet 1 of 68 6,138,123

[Figure 1: drawing not reproduced. Panel: Single Processor. Legible node values: 15 30 45; 4 5 12; 20 26; 33 40; 52 60.]



U.S. Patent Oct. 24, 2000 Sheet 2 of 68 6,138,123

[Drawing not reproduced; no caption survives in the scan.]



U.S. Patent Oct. 24, 2000 Sheet 3 of 68 6,138,123

[Figure 3: drawing not reproduced. Panels: Processor P1, Processor P2.]



U.S. Patent Oct. 24, 2000 Sheet 4 of 68 6,138,123

[Figure 4: drawing not reproduced.]



U.S. Patent Oct. 24, 2000 Sheet 5 of 68 6,138,123

[Figure 5: drawing not reproduced.]



U.S. Patent Oct. 24, 2000 Sheet 6 of 68 6,138,123

[Figure 6: drawing not reproduced.]



U.S. Patent Oct. 24, 2000 Sheet 7 of 68 6,138,123

[Figure 7: drawing not reproduced. Panel: Single Processor. Legible value: 80.]



U.S. Patent Oct. 24, 2000 Sheet 8 of 68 6,138,123

[Figure 8: drawing not reproduced.]



U.S. Patent Oct. 24, 2000 Sheet 9 of 68 6,138,123

[Figure 9: drawing not reproduced.]



U.S. Patent Oct. 24, 2000 Sheet 10 of 68 6,138,123

[Figure 10: drawing not reproduced.]



U.S. Patent Oct. 24, 2000 Sheet 11 of 68 6,138,123

[Figure 11: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 12 of 68 6,138,123

[Figure 12: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 13 of 68 6,138,123

[Figure 13: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 14 of 68 6,138,123

[Figure 14: drawing not reproduced. Legible panels: Processor P1, Processor P2.]



U.S. Patent Oct. 24, 2000 Sheet 15 of 68 6,138,123

[Figure 15: drawing not reproduced. Legible labels: ranges (60-max), (60-70), (71-max); values 60, 65, 70, 74, 75, 78.]



U.S. Patent Oct. 24, 2000 Sheet 16 of 68 6,138,123

[Figure 16: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 17 of 68 6,138,123

[Figure 17: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 18 of 68 6,138,123

[Figure 18: drawing not reproduced. Legible panels: Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 19 of 68 6,138,123

[Figure 19: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 20 of 68 6,138,123

[Figure 20: drawing not reproduced. Legible labels as read from the scan: ranges (71-max), (60-78), (79-max); values 74, 75, 78, 80, 85, 89.]



U.S. Patent Oct. 24, 2000 Sheet 21 of 68 6,138,123

[Figure 21: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 22 of 68 6,138,123

[Figure 22: drawing not reproduced. Legible panels: Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 23 of 68 6,138,123

[Figure 23: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 24 of 68 6,138,123

[Figure 24: drawing not reproduced. Legible labels: ranges (79-max), (79-89), (90-max); values 80, 85, 89, 90, 95, 98.]



U.S. Patent Oct. 24, 2000 Sheet 25 of 68 6,138,123

[Figure 25: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 26 of 68 6,138,123

[Figure 26: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 27 of 68 6,138,123

[Figure 27: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 28 of 68 6,138,123

[Figure 28: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 29 of 68 6,138,123

[Figure 29: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 30 of 68 6,138,123

[Drawing not reproduced; no caption survives in the scan.]



U.S. Patent Oct. 24, 2000 Sheet 31 of 68 6,138,123

[Figure 31: drawing not reproduced. Panel: Single Processor.]



U.S. Patent Oct. 24, 2000 Sheet 32 of 68 6,138,123

[Figure 32: drawing not reproduced. Legible values: 10, 30, 50, 70.]



U.S. Patent Oct. 24, 2000 Sheet 33 of 68 6,138,123

[Drawing not reproduced; no caption survives in the scan.]



U.S. Patent Oct. 24, 2000 Sheet 34 of 68 6,138,123

[Figure 34: drawing not reproduced.]



U.S. Patent Oct. 24, 2000 Sheet 35 of 68 6,138,123

[Figure 35: drawing not reproduced.]



U.S. Patent Oct. 24, 2000 Sheet 36 of 68 6,138,123

[Figure 36: drawing not reproduced.]



U.S. Patent Oct. 24, 2000 Sheet 37 of 68 6,138,123

[Figure 37: drawing not reproduced.]



U.S. Patent Oct. 24, 2000 Sheet 38 of 68 6,138,123

[Figure 38: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 39 of 68 6,138,123

[Figure 39: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 40 of 68 6,138,123

[Figure 40: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 41 of 68 6,138,123

[Figure 41: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 42 of 68 6,138,123

[Figure 42: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 43 of 68 6,138,123

[Figure 43: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 44 of 68 6,138,123

[Figure 44: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 45 of 68 6,138,123

[Figure 45: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 46 of 68 6,138,123

[Figure 46: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 47 of 68 6,138,123

[Figure 47: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 48 of 68 6,138,123

[Figure 48: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 49 of 68 6,138,123

[Figure 49: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 50 of 68 6,138,123

[Figure 50: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



U.S. Patent Oct. 24, 2000 Sheet 51 of 68 6,138,123

[Figure 51: drawing not reproduced. Panels: Processor P1, Processor P2, Processor P3.]



Figure 52 (Sheet 52 of 68). Panels: Processor P1, Processor P2,
Processor P3. Drawing not reproduced.



Figure 53 (Sheet 53 of 68). Panels: Processor P1, Processor P2,
Processor P3. Drawing not reproduced.



Figure 54 (Sheet 54 of 68). Panels: Processor P1, Processor P2,
Processor P3. Drawing not reproduced.



Figure 55 (Sheet 55 of 68). Panels: Disk 1, Disk 2, Disk 3.
Drawing not reproduced.



Figure 56 (Sheet 56 of 68). Panels: Disk 1, Disk 2, Disk 3.
Drawing not reproduced.



Figure 57 (Sheet 57 of 68). Panels: Disk 1, Disk 2, Disk 3.
Drawing not reproduced.



Figure 58 (Sheet 58 of 68). Panels: Disk 1, Disk 2, Disk 3.
Drawing not reproduced.



Figure 59 (Sheet 59 of 68). Range 50-59, holding the values 50, 51,
52, 53, 55, 56, 58 and 59, is shown split into ranges 50-54 and
55-59; the labels 0.0-0.34, 0.35-0.66 and 0.67-1.00 appear under
each of the two new ranges. Drawing not reproduced.



Figure 60 (Sheet 60 of 68). Panels: Disk 1, Disk 2, Disk 3.
Drawing not reproduced.



Figure 61 (Sheet 61 of 68). Panels: Disk 1, Disk 2, Disk 3.
Drawing not reproduced.



Figure 62 (Sheet 62 of 68). Panels: Disk 1, Disk 2, Disk 3.
Drawing not reproduced.



Figure 63 (Sheet 63 of 68). Boxes shown in the drawing:

Range Addition Rules (Split)
Range Removal Rules (Merge, Delete)
Range Adjustment Rules
Range Determination Rules
Breadth Adjustment Rules
Adjustment Need Rules (Rules for Fullness)
Arranging Rules (Ordering Scheme)
Global Node (G-node)
Value Range (G-node Range)
Parallel Node (P-node)
Data Value Storage (elements)
Range Relation Rules
Logical Relationship

Reference labels appearing in the drawing: D.1, A.3, G.1, G.2, G.3,
G.4, G.5, G.6, P.1, R.1, R.2. Drawing not reproduced.



Figure 64 (Sheet 64 of 68). Flow chart boxes, in order:

START
Input x
Locate global node whose range R bounds x (lower bound < x < upper bound)
Locate individual node that may contain x
Process x: insert or remove as required
Adjust Ranges, by one of two branches:
    Add global node, then adjust ranges of new node and adjacent nodes
    Remove global node, then adjust ranges of remaining nodes

Figure 64
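The flow chart's steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented method: the class name GlobalNodes, the capacity parameter, the split-at-the-middle policy and the remove-when-empty policy are all hypothetical, the structure runs on a single processor, and distinct values are assumed.

```python
import bisect

class GlobalNodes:
    """Hypothetical, single-process model of the Figure 64 flow."""

    def __init__(self, capacity=4):
        self.lows = [float("-inf")]  # lower bound of each node's range
        self.nodes = [[]]            # sorted values held by each node
        self.capacity = capacity     # assumed fullness limit per node

    def _locate(self, x):
        # "Locate global node": the node whose range bounds x.
        return bisect.bisect_right(self.lows, x) - 1

    def insert(self, x):
        i = self._locate(x)
        bisect.insort(self.nodes[i], x)          # "Process x": insert
        if len(self.nodes[i]) > self.capacity:   # "Add global node"
            mid = len(self.nodes[i]) // 2
            upper = self.nodes[i][mid:]
            self.nodes[i] = self.nodes[i][:mid]
            self.nodes.insert(i + 1, upper)
            # "Adjust ranges of new node and adjacent nodes"
            self.lows.insert(i + 1, upper[0])

    def remove(self, x):
        i = self._locate(x)
        node = self.nodes[i]
        j = bisect.bisect_left(node, x)
        if j == len(node) or node[j] != x:
            return None                          # x is not present
        value = node.pop(j)                      # "Process x": remove
        if not node and len(self.nodes) > 1:     # "Remove global node"
            del self.nodes[i]
            del self.lows[i]
            # "Adjust ranges of remaining nodes": first node covers the low end
            self.lows[0] = float("-inf")
        return value
```

With capacity 4, inserting 50, 51, 52, 53 and 55 overfills the single starting node, which is then split into two nodes with ranges below 52 and from 52 upward; removing 50 and 51 empties the first node, which is dropped so that the remaining node again covers the whole range.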



Figure 65 (Sheet 65 of 68). Panels: Processor 1 through Processor 9.
Drawing not reproduced.



Figure 66 (Sheet 66 of 68). Labels: Terminal, User 127.
Drawing not reproduced.



Figure 67 (Sheet 67 of 68). Labels: SERVER, Disk 1, Disk 2, Disk 3.
Drawing not reproduced.



FIG. 68 (Sheet 68 of 68). Labels shown in the drawing:

Composite Global Node Based on Range (a-d)
Local Index 1 (B-tree)
Local Index 2 (B-tree)
Local Index 3 (B-tree)
Local Link Based on Difference in Ranges (q-s), (t-z)
Implicit Global Link Based on Common Range (t-z)

Drawing not reproduced.

METHOD FOR CREATING AND USING 
PARALLEL DATA STRUCTURES 

CROSS-REFERENCE TO RELATED 
APPLICATIONS 

This application claims priority under 35 U.S.C. § 119(e)
to the following related provisional applications: Ser. No.
60/023,340, filed Jul. 25, 1996 and Ser. No. 60/022,616,
filed Jul. 26, 1996.

BACKGROUND OF PROBLEM AND 
SOLUTION 

In recent years, the need for more computational power and speed
in computer systems has led to the use of multiple processors to
perform computational tasks. The processors work cooperatively,
sharing resources and distributing work amongst themselves by
sending data over communication lines or through shared memory.
This practice of utilizing multi-processors to accomplish a single
task is known as parallel processing or distributed processing.
Although the terms parallel and distributed may describe distinct
forms of multi-processing, they are in essence synonymous. The
problems described and solved by the present invention apply
equally to parallel and distributed processing. In addition, these
problems/solutions apply to any component or aspect of a computer
system to which work may be distributed amongst multiple
components, even in non-parallel systems: one such "non-parallel"
application described herein is the use of the present invention
to manage dynamic access storage devices (DASD) such as disk
drives [see section Rules for Fullness and Ordering Scheme for
B-trees Stored on Disk].

Dividing work amongst multi-processors in such a way that the work
is divided evenly and performed in an efficient manner is the goal
of parallel/distributed processing, and dividing work amongst
multiple system components equally and efficiently is also
desirable in sequential (single-processor) systems. Many
well-known sequential methods, systems or processes exist to
efficiently perform computational tasks (sorting, merging, etc.).
Their parallel counterparts have yet to be invented. Some new
parallel methods are parallelized versions of existing sequential
methods, for example, the parallel recursive merge-sort [see
"Introduction to Parallel Methods" by Joseph JaJa, Addison-Wesley,
1992]. The invention described herein is a method of creating
parallel data-structures. The preferred embodiment of the present
invention parallelizes single-processor, ordered list methods to
efficiently distribute the work and storage for ordered list
maintenance amongst multiple processors, processing components
and/or storage locations: this ordered list maintenance is carried
out through adapted versions of single-processor data-structures
expressed as graphs (B-trees, AVL trees, linked-lists, m-way
trees, heaps, etc.).

DISCUSSION OF PROBLEM AND PREFERRED 
EMBODIMENT 

The goal in parallel processing is to utilize a number of
processors (P) to increase the system's speed and power by a
factor of P: optimally, a task requiring time T on a
single-processor can be accomplished in time T/P on P processors.
The problem is the even distribution of work amongst the P
processors. Many new methods have arisen from the field to
efficiently distribute the work for standard computational tasks
(e.g. sorting, merging, etc.). One standard task is the
maintenance of ordered lists of data: many methods exist for
single-processor systems to accomplish ordered list maintenance;
the problem described and solved herein is the efficient
distribution of work amongst multi-processors to accomplish
efficient ordered list maintenance. (The term "ordered list"
includes many data-structures: sorted lists, heaps, stacks, trees,
etc.).

In general (regardless of the type of system keeping the lists),
the maintenance of ordered lists consists of two basic operations:
Insert() and Remove() (Search()/Find() is implied). Insertion into
the lists requires that an element of data be added to the list
and that its position within the list be defined. Assuming the
ordered list {5,12,46,67,80,99}, the Insertion (Insert(x)) of the
numeric element 35 (Insert(35)) results in the list
{5,12,35,46,67,80,99}. Removal (Remove(x)) of an element can take
several forms: removal by location, by value, by range of values,
etc. Again, assuming the list {5,12,46,67,80,99}, Removal of the
fourth (4th) element results in {5,12,46,80,99}, Removal of the
value 12 results in {5,46,67,80,99}, and Removal of the smallest
element greater than 50 results in {5,12,46,80,99}. The Remove
operation is considered to return the value of the removed element
for use, if present, or to return the information that a specific
value is not contained in the list, if not present.
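The single-processor operations just described can be sketched as follows. This is a minimal illustration using Python's standard bisect module on a sorted list; the function names are assumptions made for the example and do not come from the patent.

```python
import bisect

def insert(lst, x):
    """Insert(x): add x to the sorted list, keeping the order."""
    bisect.insort(lst, x)

def remove_value(lst, x):
    """Remove by value: return x if present, else None ("not present")."""
    i = bisect.bisect_left(lst, x)
    if i < len(lst) and lst[i] == x:
        return lst.pop(i)
    return None

def remove_at(lst, k):
    """Remove by location: k is 1-based, as in "the fourth element"."""
    return lst.pop(k - 1)

def remove_smallest_greater_than(lst, bound):
    """Remove by range: the smallest element strictly greater than bound."""
    i = bisect.bisect_right(lst, bound)
    return lst.pop(i) if i < len(lst) else None
```

Applied to {5,12,46,67,80,99}, these four functions reproduce the four results given in the text.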

The problem presented and solved in the preferred embodiment is
the parallelizing of the list maintenance described above. The
essential functioning of the list remains the same in the parallel
version of the data-structure. The Insert(x) and Remove(x)
operations produce the same results. However, on a
single-processor system these operations are performed by one
processor which can only Insert or Remove one element at a time;
on a multi-processor system with P processors, the parallel
version of the method can Insert and/or Remove P elements at a
time as described below.

Assuming a multi-processor system with 3 processors (P=3), and
also assuming a list containing the elements
{4,13,14,20,28,34,39,43,53,67,76,81}, we have the following
parallelized result: each processor keeps approximately one-third
of the elements at any given time; each processor may Insert(x)
into its own sub-list at any given time (possibly sending the
element x to one of the other processors for Insertion into one of
the other sub-lists); each processor may Remove(x) from its
sub-list at any time and may request that other processors attempt
to locate element x in their sub-lists if x is not present in the
original processor's sub-list; any other processor finding x in
its sub-list then sends x to the original processor.

The sub-lists are distributed in this example by cutting the list
into equal thirds. This manner of distribution is for the purpose
of a generalized example only. The Example given in this section
is intended to introduce the reader to the problem in the most
generalized manner possible; the Example here contains none of the
specific details of the parallel method.

The Parallel List

Processor #1 (P1) keeps one-third of the elements:
    Sub-list S1={4,13,14,20}
Processor #2 (P2) keeps one-third of the elements:
    Sub-list S2={28,34,39,43}
Processor #3 (P3) keeps one-third of the elements:
    Sub-list S3={53,67,76,81}

Insertion into the Parallel List

The elements 72, 22, and 12 are to be Inserted. All three
processors simultaneously perform the Insertion, giving the
results:

Processor #1 (P1): Insert(72): sends element 72 to P3, receives
element 12 from P3
    Sub-list S1={4,12,13,14,20}
Processor #2 (P2): Insert(22): inserts element 22 directly into
its sub-list (S2)
    Sub-list S2={22,28,34,39,43}
Processor #3 (P3): Insert(12): sends element 12 to P1, receives
element 72 from P1
    Sub-list S3={53,67,72,76,81}

Removal from the Parallel List (List Contains Elements from
Insertion Above)

The values 37, 28 and 13 are to be found and Removed. All three
processors simultaneously perform the Removal, giving the results:

Processor #1 (P1): Remove(37): requests another processor to find
37 and receives reply from P2 that the element 37 is not present,
receives request for element 13 from P3 and Removes 13 from the
list.
    Sub-list S1={4,12,14,20}
Processor #2 (P2): Remove(28): removes 28 directly from the list,
replies "37 not present" to P1
    Sub-list S2={22,34,39,43}
Processor #3 (P3): Remove(13): requests another processor to find
13, receives 13 from P1
    Sub-list S3={53,67,72,76,81}
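The generalized example above can be replayed with a small single-process model. This is a sketch under assumptions, not the patent's parallel method: the class ParallelList and the split points 21 and 48 are hypothetical, chosen only so that each value is routed to the sub-list that holds it in the example, and the three "processors" are served one after another rather than simultaneously.

```python
import bisect

class ParallelList:
    """Toy model: one sorted sub-list per 'processor' (0-based index)."""

    def __init__(self, sublists, split_points):
        # split_points[i] is the smallest value owned by processor i + 1
        # (hypothetical boundaries; the source only cuts the list in thirds).
        self.sublists = [sorted(s) for s in sublists]
        self.split_points = list(split_points)

    def owner(self, x):
        """Index of the processor whose sub-list should hold x."""
        return bisect.bisect_right(self.split_points, x)

    def insert(self, origin, x):
        """Insert x; return True if it was forwarded to another processor."""
        target = self.owner(x)
        bisect.insort(self.sublists[target], x)
        return target != origin

    def remove(self, origin, x):
        """Remove x wherever it lives; return x to origin, or None if absent."""
        s = self.sublists[self.owner(x)]
        i = bisect.bisect_left(s, x)
        if i < len(s) and s[i] == x:
            return s.pop(i)
        return None

pl = ParallelList([[4, 13, 14, 20], [28, 34, 39, 43], [53, 67, 76, 81]],
                  split_points=[21, 48])
pl.insert(0, 72)   # P1 forwards 72 to P3
pl.insert(1, 22)   # P2 inserts 22 directly
pl.insert(2, 12)   # P3 forwards 12 to P1
pl.remove(0, 37)   # P1 asks for 37: not present
pl.remove(1, 28)   # P2 removes 28 directly
pl.remove(2, 13)   # P3 asks for 13: found in P1's sub-list
```

After these six calls the three sub-lists match S1, S2 and S3 as listed at the end of the Removal example.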

It must be stressed that the example above is a generalized
example intended to explain the basic logical functionality of the
problem. The precise details and organization of parallelized
lists are described in subsequent sections.

The essential functioning of an ordered list is described above;
however, many different forms of lists are used on modern systems,
and many different types of data may be stored. Efficient
methods/data-structures are used to maintain such lists on
single-processor systems: heaps, binary trees, AVL trees, B-trees,
etc., which are well known in the art. (For descriptions of such
methods/data-structures see "File Structures Using Pascal" by
Nancy Miller, The Benjamin/Cummings Publishing Co., Inc. (1987)).
The methods used on modern systems were designed to function on
single-processor systems efficiently. This efficiency is expressed
by asymptotic time-complexity functions. The functions are
generally expressed in terms of n in the form O(f(n)) [e.g.
O(log_2 n) or O(n^2)]. For the problem to be truly solved, a
parallel version of a list maintenance method must distribute the
work amongst the P processors efficiently so that the
time-complexity approaches optimum improvement (speedup). Perfect
speedup for a given parallelized method would be O(f(n)/P).

SUMMARY 

The present invention is a means to create parallel
data-structures and associated maintenance programs. The
data-structures and programs may take a variety of forms, all
using the same essential steps and components. The parallel
data-structures distribute a given data set to system components
by grouping the data set according to ranges. These ranges are
sub-divided for distribution into parallel form. A given data
value is located by its placement within an appropriate range; the
ranges are located by their relationships to each other and the
data set as a whole; thus, as the ranges are related to each
other, the order of the data set is maintained and access may be
gained to the data set by range, and as the data values are
related to the ranges, the data values themselves may be
maintained as well.

In order for a data set to change, the values or the relationships
between the values must change. The present invention allows this
change by altering the ranges or the relationships between the
ranges and thereby altering the values or relationships between
values. Altering a range may alter the sub-set of data contained
by the range, and this range alteration may then be used to
re-distribute data values and maintain appropriate sizes and
locations for the data sub-sets. The maintenance of the ranges,
sub-sets and data value distribution within the sub-sets offers a
wide variety of possible overall distributions of data sets and
methods of maintaining order. Some of these distributions and
methods are parallel forms of serial data-structures.
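The idea that altering a range re-distributes the data it contains can be illustrated with a short sketch. The representation is an assumption made for the example (a list of boundary values plus one sorted sub-set per range), and the function names locate and shift_boundary are hypothetical; the patent's own rules for adjusting ranges are given in later sections.

```python
import bisect

def locate(bounds, x):
    """Index of the range holding x.

    Range i covers bounds[i-1] <= value < bounds[i]; the first range is
    open below and the last range is open above.
    """
    return bisect.bisect_right(bounds, x)

def shift_boundary(subsets, bounds, i, new_bound):
    """Move the boundary between range i and range i+1 to new_bound and
    re-distribute the values of the two affected sub-sets accordingly."""
    merged = sorted(subsets[i] + subsets[i + 1])
    cut = bisect.bisect_left(merged, new_bound)
    subsets[i], subsets[i + 1] = merged[:cut], merged[cut:]
    bounds[i] = new_bound

# Three ranges: below 30, 30 up to 60, and 60 upward.
bounds = [30, 60]
subsets = [[4, 13, 14, 20, 22, 28], [34], [67, 76]]

# The first sub-set is crowded; lowering its upper boundary to 20 moves
# 20, 22 and 28 into the neighbouring range without touching their order.
shift_boundary(subsets, bounds, 0, 20)
```

Before the shift, locate(bounds, 22) selects the first range; afterwards it selects the second, and the sub-sets taken together still form the same ordered data set.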

The present invention offers many advantages including: a flexible
means to create a wide variety of parallel data-structures rather
than simply defining a single instance of a particular parallel
data-structure; flexible methods of distributing data within a
structure for efficiency; the ability to create parallel versions
of serial data-structures that maintain the essential efficiency
and express the essential form of the serial data-structures
without significant alteration of the principles or methods that
underlie the serial data-structures.

OBJECTS AND ADVANTAGES 

One object of the method of creating data-structures is to
distribute work and storage to multiple system components. The
method can accomplish the distribution of work by allowing
simultaneous access to multiple parallel nodes, graphs or indexes
by multiple processing elements in a flexible manner. It can
accomplish the distribution of storage by distributing multiple
parallel nodes to multiple storage locations.

Another object is to provide the ability to distribute data more
evenly. A data set with a skewed distribution may be more evenly
distributed by breaking the data into sub-sets. Each sub-set may
be distributed evenly while all of the sub-sets taken together
still express the original distribution of the data set.

An advantage of the method when used to transform serial
data-structures into parallel form is that the original structure
of the serial algorithm can be expressed without altering the
essence of the algorithm.

Another advantage is the wide range of possible structures
created. Many serial data-structures may be adapted using the same
principles, as well as many new parallel data-structures created.

Another advantage is the use of various components of the method
to refine the functioning, data distribution, work distribution
and efficiency of the data-structures and associated maintenance
programs through the characteristics of the rules that support the
various components. For only one example, see the Rules for
Fullness and Ordering Scheme for B-trees Stored on Disk section
contained herein.

Still other objects and advantages will become apparent 
through a consideration of the other descriptions of the 
invention contained herein. 

BRIEF DESCRIPTION OF FIGURES 
FIG. 1 shows serial b-tree. 

FIG. 2 shows parallel b-tree on two processors with 
indication of G-node and P-nodes for preferred embodiment. 

FIG. 3 shows parallel b-tree of FIG. 2 after removal of one 
G-node. 

FIG. 4 shows serial AVL tree of Example 1 for preferred 
embodiment. 

FIG. 5 AVL tree of FIG. 4 after addition of element. 

FIG. 6 AVL tree of FIG. 4 after rotation. 

FIG. 7 AVL tree of FIG. 4 after another addition. 




FIG. 8 AVL tree of FIG. 4 after another addition.

FIG. 9 AVL tree of FIG. 4 after rotation.

FIG. 10 AVL tree of FIG. 4 after removal of element.

FIG. 11 parallel AVL tree of Example 1 for preferred embodiment,
comprising 3 separate trees stored on 3 processors.

FIG. 12 AVL tree of FIG. 11 after addition of element.

FIG. 13 AVL tree of FIG. 11 after another addition.

FIG. 14 shows redistribution of elements to maintain Ordering
Scheme.

FIG. 15 shows range split and redistribution of elements resulting
in creation of new G-node (G-node Split) for Examples 1 and 2 of
preferred embodiment.

FIG. 16 AVL tree of FIG. 11 after insertion of G-node.

FIG. 17 AVL tree of FIG. 11 after rotation.

FIG. 18 AVL tree of FIG. 11 after another addition of elements.

FIG. 19 shows redistribution of elements to maintain Ordering
Scheme.

FIG. 20 shows range split and redistribution of elements resulting
in creation of new G-node (G-node Split) for Examples 1 and 2 of
preferred embodiment.

FIG. 21 AVL tree of FIG. 11 after insertion of G-node.

FIG. 22 AVL tree of FIG. 11 after another addition of elements.

FIG. 23 shows redistribution of elements to maintain Ordering
Scheme.

FIG. 24 shows range split and redistribution of elements resulting
in creation of new G-node (G-node Split) for Examples 1 and 2 of
preferred embodiment.

FIG. 25 AVL tree of FIG. 11 after insertion of G-node.

FIG. 26 AVL tree of FIG. 11 after rotation.

FIG. 27 shows removal of elements from tree of FIG. 11.

FIG. 28 shows another removal of elements from tree of FIG. 11.

FIG. 29 shows result of G-node removal from tree of FIG. 11.

FIG. 30 shows serial B-tree of Example 2 for preferred embodiment,
comprising 3 separate trees stored on 3 processors.

FIG. 31 B-tree of FIG. 30 after addition of element.

FIG. 32 B-tree of FIG. 30 after b-tree node split.

FIG. 33 B-tree of FIG. 30 after additional b-tree node split.

FIG. 34 B-tree of FIG. 30 after another addition.

FIG. 35 B-tree of FIG. 30 after another addition.

FIG. 36 B-tree of FIG. 30 after b-tree node split.

FIG. 37 B-tree of FIG. 30 after removal of element and b-tree node
merge.

FIG. 38 parallel B-tree of Example 2 for preferred embodiment.

FIG. 39 B-tree of FIG. 38 after addition of element.

FIG. 40 B-tree of FIG. 38 after another addition.

FIG. 41 shows redistribution of elements to maintain Ordering
Scheme.

FIG. 42 B-tree of FIG. 38 after insertion of G-node.

FIG. 43 B-tree of FIG. 38 after b-tree node split.

FIG. 44 B-tree of FIG. 38 after additional b-tree node split.

FIG. 45 B-tree of FIG. 38 after another addition of elements.

FIG. 46 shows redistribution of elements to maintain Ordering
Scheme.

FIG. 47 B-tree of FIG. 38 after insertion of G-node.

FIG. 48 B-tree of FIG. 38 after another addition of elements.

FIG. 49 shows redistribution of elements to maintain Ordering
Scheme.

FIG. 50 B-tree of FIG. 38 after insertion of G-node.

FIG. 51 B-tree of FIG. 38 after b-tree node split.

FIG. 52 shows removal of elements from tree of FIG. 38.

FIG. 53 shows another removal of element from tree of FIG. 38.

FIG. 54 shows result of G-node removal from tree of FIG. 38.

FIG. 55 parallel B-tree stored on three disks for Example of
B-trees Stored on Disk section.

FIG. 56 B-tree of FIG. 55 after element addition.

FIG. 57 shows redistribution of elements to maintain Ordering
Scheme.

FIG. 58 B-tree of FIG. 55 after another addition of elements.

FIG. 59 shows range split and redistribution of elements resulting
in creation of new G-node (G-node Split) for B-tree of FIG. 55.

FIG. 60 B-tree of FIG. 55 after insertion of G-node.

FIG. 61 B-tree of FIG. 55 after b-tree node split.

FIG. 62 B-tree of FIG. 55 after additional b-tree node split.

FIG. 63 data model for a preferred instance of present invention.

FIG. 64 flow chart for a preferred instance of present invention.

FIG. 65 shows nine P-nodes related by complex G-node Range.

FIG. 66 diagram of hypercube network with terminals and disk
storage for Example of Application 1.

FIG. 67 diagram of distributed network showing three client
terminals, one server and three disk-packs for Example of
Application 2.

FIG. 68 is a block diagram illustrating the principles of the
invention.

PREFERRED EMBODIMENT

Introduction

The preferred embodiment of present invention relates to a process
of creating parallel data-structures which adapts sequential
data-structures and their associated processing programs for use
in parallel or distributed environments. The invention achieves
this by creating parallel data-structures that are identical in
form and function to the sequential data-structures in a parallel
environment. The adapted parallel data-structures and methods can
be used in the same way as their sequential counterparts but in a
parallel environment. The sequential data-structures which are
adapted must have configurations determined by the orderable
qualities contained in the data-structures. The sequential
data-structures and their associated maintenance programs
generally have three functions in common: Find(), Insert() and
Remove() functions.

FIG. 68, depicting applicant's parallel indexing method, shows a
composite Global Index comprising three local indexes; one of the
five composite global nodes is circled and labeled (a-d); the
figure shows common structure




and access methods for all indexes, common ranges on all indexes,
local links based on range difference, global links based on range
commonality, and storage of one value "v" on index 2 in Range
(t-z). A query on the value "v" can originate on any index 1, 2 or
3 equivalently. Assuming the query starts on index 1, the request
may be shunted immediately to either index 2 or 3 if the processor
for index 1 is busy: indexes 2 and 3 could also pass control to
any other local index. The query travels down the rightmost local
link on the controlling processor, locating the range (t-z) as the
range to hold v: in the preferred embodiment, index 2 is
immediately calculated as the specific index to hold v within the
range (t-z) by virtue of its being the center of range (t-z). If
index 2 does not already have control of the query, then the query
traverses the rightmost global link to index 2 at range (t-z) and
index 2 accesses the range and thereby the value v within it. This
process produces at least two different paths to v, chosen
dynamically at query time; if the query started on local index 2,
then it requires 2 accesses (1 at the root plus 1 at [illegible]);
if it started on either local index 1 or 3, it requires 3 accesses
(2 for the local index plus 1 at [illegible]).

In the parallel data-structures, each processor may contain
multiple elements at any position v within the structure. The
number of elements contained at position v is determined by the
Rule for Fullness and Ordering Scheme for the given parallel
data-structure. The simplest Rule and Scheme allow zero (0)
through two (2) elements per processor to be contained at any
position v. Such a simple Rule and Scheme are assumed for the
introduction and any other section of this application unless
otherwise stated.

In the Insert() function, for sequential data-structures, the
insertion of an element y [Insert(y)] results in the placement of
the element y in the sequential data-structure at some position v.
The position v is determined by the element y's orderable or
ordinal relationship to the other elements and the positions of
the other elements in the data-structure. The position v is
determined by the "rule for insert" for the given sequential
data-structure. Position v is also determined by the rule for
insert in the parallel data-structure: each processing element
creates a configuration for the data-structure identical to the
configurations at all other processing elements.

For parallel inserts and removes, a given processor simply locates
the position v in the data-structure either by position or value.
Multiple processors may then cooperate to search for a desired
data element y within position v. The processors search through
the 1 <= n <= (2P-1) elements at the positions v at all processors
i (1 <= i <= P) in parallel. If the removal of an element y at
position v leaves position v sufficiently empty, then each
processor re-orders its data-structure according to the missing
position v corresponding to the element y in the same way that the
sequential method would. If the addition of an element y requires
more nodes [illegible], then each processor re-orders its
data-structure according to an additional position v corresponding
to the element y in the same way that the sequential method would.

Preferred Embodiment: Uses of Data-Structures

Uses of the adapted parallel versions of the data-structures and
maintenance programs are the same as the uses of their sequential
counterparts, only in a parallel environment. The speedup of the
parallelization brought about by the present method is very
efficient and justifies its design.

Preferred Embodiment

Definitions

Many terms must be defined to adequately describe the Process of
Adaptation.

At Will (Implies Blind): an activity that a processor may perform
at any time regardless of the activities of other processors;

Blind (Blindly): activities performed by a processor or set of
processors with no cooperation from other processors;

Cooperative (cooperatively): activities performed by a set of
processors that require communication and/or coordination between
processors;

Data-structure: an organization or method of organization for
data. Preferably, the data-structures are based (form or
configuration and functioning) on orderable data; e.g. heaps,
B-trees, binary search trees, etc.;

Defined G-node: See G-node

Element: a single data-value within a data-structure.
Each element may be of any type (e.g. integer, real, char,

Using the Rule for Fullness and Ordering Scheme men- string, enumerated, pointer, record, or other). The elements 

tioned above for parallel data-structures, each processor may 45 must all relate to each other in some orderable fashion; 

contain as many as two elements y 19 and y 2 at position v. Element Deletion — Removal of an element from a 

Consequently with P identical data-structures, one at each G-node; 

processor, there exist in total l^n^(2P-l) elements y,y Element Addition — Insertion of an element to a G-node; 

(l=i=P) (l=j = 2) at all positions v, taken cumulatively. Explicit G-node Range — see G-node Range 

Any processor i (1 ^i^P) may insert any element y into the so Global — all of the processors; 

parallel data-structure, and the element y will be placed at G-node (Global Node) — a set of P(number of processors) 

position v in one of the data-structures held by one of the P-nodes. Each G-node contains 0<n<(xP) elements (x=Max 

processors. Although this may result in different configura- number of elements in each P-node). In the preferred 

tions for the sequential and parallel versions of the embodiment, each P-node in a G-node occupies the same 

structures, the essential relationships between the data ele- 55 position in each per-processor data-structure. Each P-node 

ments in the data-structures will remain the same for both in a G-node contains the G-node Range of that G-node. The 

versions of a given data -structure. G-node functions in the parallel method in the same way an 

The Remove() function for sequential data-structures has S-node functions in the corresponding sequential method, 

one of two forms. A RemoveQ according to position finds an The G-node uses the G-node Range to relate to the other 

element in a given position in the data-structure and removes 60 G-nodes. G-nodes are created simultaneously with the 

the element. A RemoveQ according to value, searches the P-nodes which are contained in the G-node. 

data-structure for a given value of y and removes the G-nodes have the following properties: each has a G-node 

element. In both cases, the data-structure may be re -ordered Range; all the G-nodes in a parallel data-structure may 

to compensate for the absence of the removed element. In an become full or empty or partially empty; when a G-node 

adapted parallel version of a given data -structure, any or all 65 becomes full, it is Split; when a G-node becomes sufficiently 

of the processors may execute a Remove(y) function appro- empty, it is deleted. The determination of when a G-node is 

priate to the sequential data-structure with the same result. full or sufficiently empty depends on the Rule for Fullness 




for the G-node. Each G-node is composed of P sets of data elements within the G-node Range; each of the P sets may contain from 0 to X elements. A Defined G-node is a G-node with a fully defined G-node Range. An Undefined G-node has a G-node Range with one or more boundaries left open or undefined;

G-node removal — deletion of a G-node from the data-structure; this effectively removes old G-node Ranges from the data-structure;

G-node insertion — Addition of a G-node to the data-structure; this effectively adds new G-node Ranges to the data-structure;

G-node Range — The G-node Range is the range of values that the G-node may contain; in the preferred instance, a set of two values R(G_oio)={R(g_oi1), R(g_oi2)} that are the minimum and maximum values of the elements which may be contained in the G-node G. The G-node Range determines the proper placement of the G-node within the parallel data-structure and thereby determines the proper placement of an element or P-node within each per-processor data-structure;

The G-node is stored across multiple processors, but the G-node Range uses the same range for each component of the G-node on each processor. The Range is stored with the G-node. The Range may be stored either explicitly or implicitly: explicit storage of the G-node Range is the listing of the values that define the range; implicit storage would be the storage of one or more values from which a range could be calculated;

G-node Split — A G-node Split occurs when a G-node becomes full. The Splitting process divides all of the values contained in the G-node into two roughly equal sets X and Y with distinct ranges. One set X remains in the G-node, the other set Y is stored in a newly created G-node. The G-node Ranges of the two nodes are set according to the division of the sets X and Y. The G-node Split is a method of adding new G-nodes to the set of G-nodes comprising the parallel data-structure; by virtue of the G-node Range as the basis for this process it is also a range addition method.

Implicit G-node Range — see G-node Range

Link — representation and reference to the relationship of adjacent nodes;

MAXVAL — Maximum possible value (∞)

MINVAL — Minimum possible value (-∞)

Ordering Scheme — The manner in which data elements are arranged within a G-node. May be ascending, descending, partially or fully sorted, completely unordered in addition to many other arrangements. Different Schemes may be defined for different data-structures. Schemes may be defined to provide efficient access paths, efficient data distribution, proper placement of an element into an appropriate P-node within a G-node, or other provisions;

Ordinable — data that has the capacity to be ordered;

P — number of processors on a parallel machine or distributed network;

Parallel — processes or entities performed or existing on multiple processing or memory storage units, designed to perform or exist on multiple processors or system components, or having a structure that lends itself to similar distribution;

Parallel Data-structure or Global Data-structure — the data-structure that results from applying this process of adaptation to a sequential data-structure. A parallel or Global data-structure is composed of a set of P sequential data-structures each of which is composed of a set of P-nodes and incident links. The P-nodes and links form precisely the same configuration on each processor;

Partially Defined G-node: See G-node

Partially Undefined G-node: See G-node

Per-processor — sequential activities on a processor or on a set of processors: this term conceptually divides a parallel entity or process into its sequential parts and refers to each processor's activity separately;

P-node (Parallel Node) — an adaptation of an S-node. A P-node contains 0 to x elements which fall into its G-node Range. In addition, a P-node relates to the other P-nodes in the data-structure not only by the value of the elements contained in the P-node, but also by the G-node Range which is contained in each P-node and determined by the G-node to which the P-node belongs. Each P-node in a data-structure is part of a G-node. When converting an S-node to a P-node, extra links are not added for the extra elements. The rules for relationships between P-nodes on a processor are the same as the rules for relationships between the S-nodes of the sequential data-structure from which the parallel version was derived with respect to G-node Ranges. Except for P-nodes created at the very beginning of the process, P-nodes are generally created through the splitting of G-nodes;

Processor — a processing element or CPU with or without its own local memory in a parallel or distributed environment. The processors are all interconnected in the parallel machine or network by communication lines or by shared memory. Also used to refer to any system component to which work may be distributed;

Range Relation Function — This function R() determines how G-node Ranges relate to each other (i.e. less than, greater than, equal to, subsets of, supersets of each other, etc.);

Range Determination Rules — these rules determine ranges for the data: in the preferred instance, the range is based on data placement (the number, value, distribution and/or positions of element values [...]); however, ranges may also be set to force a change in the data placement, or set according to other criteria;

Rule for Fullness — The rule by which the fullness or emptiness of a G-node is determined. Full G-nodes are split; empty G-nodes are removed. Different Rules may be defined for different data-structures. The goal in setting the rules for determining the fullness of G-nodes is to make the most efficient use of space and processing time. The Rule for Fullness expresses and may be used to maintain: range fullness or emptiness, range breadth (narrowness or broadness), density and distribution of data values within [...];

Rule for Insert (a.k.a. Rule for Remove and Rule for Positioning Nodes) — the ordinal or orderable relationships between the data elements contained in the nodes of a given data-structure; especially in the Insert() and Remove() functions of sequential programs and data-structures and the same functions (with respect to G-node Ranges) for their parallel counterparts;

Sequential or Serial — processes or entities performed or existing on a single processor or designed to perform or exist on a single processor;

S-node (Sequential node) — a single cell within a sequential data-structure that contains a single element. Each S-node relates to its adjacent nodes or to the rest of the data-structure according to the ordinable relationships between the element the node contains and the elements contained in the other nodes;

Preferred Embodiment
Symbols

Data-structures — Sets of nodes containing elements, and incident links.




Elements — An element is a single piece of ordinable data. Elements are generally depicted in one of two ways: 1. An element may be thought of as having a constant value; such an element usually belongs to a set that contains members with single subscripts: (W={w_1, w_2, w_3, . . . w_n}); 2. An element may also be referenced by its position in a data-structure; these are usually referenced by the subscripted cell letters to which they belong: (X={x_111, x_112, x_321, x_122 . . . }). Elements are also frequently depicted as their actual values, both in set and graphical form.

G-node Ranges: A G-node Range is considered consistent over the entire G-node, and therefore has the same kind of notation at each P-node. Because the actual values of the range may be explicit or implicit, the Range is indicated by a function reference [R(G-node)]. The parameter G-node is expressed in the manner appropriate to the given example. If the function receives an element parameter, it may then be used to compare the element to ranges to determine proper placement of the element. The function R() may be called by any processor and may use any other additional parameters needed to calculate the Range. In the preferred instance, the result is a minimum and maximum value allowable for the P-node and/or G-node: R(A_oio)={R(a_oi1), R(a_oi2)}={minimum, maximum}; the naught in the third position may generally be replaced with a 1 or a 2 indicating the limits of the Range; most G-node Ranges consist of two values. Example (for integer type elements): G-node T_o5o has G-node Range R(t_o5o)={R(t_o51), R(t_o52)}={75, 116}; this means that G-node T_o5o may contain elements between 75 and 116 in value and still be consistent with the data-structure rules. On diagrams, G-node Ranges are depicted parenthesized under the P-nodes (G-nodes) to which they belong.

G-nodes — a set of P-nodes related by their G-node Ranges. If the processor number subscript and the element number subscript of a set member are naught, then the representation is of a G-node. The set T as a set of G-nodes: T={t_o1o, t_o2o, t_o3o, t_o4o}. The graphical representation of a G-node is rather unique, being distributed amongst processors around the page. FIGS. 1 and 2 contain the same set of data values on a serial B-tree and a parallel B-tree respectively. FIG. 2 contains five (5) G-nodes: A_o1o, A_o2o, B_o1o, C_o1o, D_o1o. Assuming the parallel B-tree in FIG. 2 contains the set S of integer data elements, we depict the set S as data-structure nodes, G-nodes, and P-nodes: S={A, B, C, D}={{A_o1o, A_o2o}, {B_o1o}, {C_o1o}, {D_o1o}}={{{A_11o, A_21o}, {A_12o, A_22o}}, {{B_11o, B_21o}}, {{C_11o, C_21o}}, {{D_11o, D_21o}}}. G-node A_o2o comprising P-nodes A_12o and A_22o is identified in FIG. 2.

P-nodes — assume a number (P>1) of processors: P-nodes assume multiple elements; P-nodes have three (3) subscripts (processor-number, node-number, element or cell number). A P-node has multiple cells for elements; when the element-number subscript is specified, the reference is to a specific cell within the P-node; when the element-number subscript is naught (o), the reference is to the entire P-node. Reference to P-node cells: (on processor 1) T={t_111, t_112, t_121, t_122, t_131, t_132}; Reference to P-nodes: T={t_11o, t_12o, t_13o}. For greater convenience, P-nodes may be identified by a node letter and non-subscripted processor number (e.g. A1, A2, B1, B2, etc.) See G-nodes.

Sets — S, P, G-nodes and sets of elements. Sets are designated by upper case letters; members of sets are generally designated by lower case, subscripted letters;

S-nodes — nodes within a sequential data-structure. S-nodes in a set generally have only one subscript S={s_1, s_2, s_3, . . . };

Subscripts — integers or variables between 1 and 9 inclusive, or a naught (o). Subscript numbers will not exceed 9 unless otherwise stated. Naught (o) — represents an absence of specification of individual members of a set with regard to the subscript position, and, thereby, will define a sub-set. Variables with three (3) subscripts have the following subscript order: S_(processor number, node number, cell number).

Preferred Embodiment
Preferences for Adaptable Data-Structures

1). Ordinal data is preferred for the elements contained and ordered within the sequential data-structure by the sequential method and/or rules of ordering.

2). The sequential data-structure is preferred to be capable of representation by nodes containing data elements and links that relate the nodes according to the relationships of the elements contained in the nodes. The relationships represented by the links may relate the node to adjacent nodes, non-adjacent nodes and/or to the rest of the data-structure as a whole.

3). The adapted nodes are preferred to have the capability of the calculation of G-node Ranges which may be related to each other in an ordinal fashion.

4). Contiguity: the structure is preferred to have the quality that the placement of nodes makes the data ranges contiguous with respect to the structure and rules of the graph and the data set contained and organized according to its ranges.

Preferred Embodiment
General Description

The purpose of this section is to describe the data-structures and functions in a less technical manner than that of the pseudo-code contained in other sections.

This section contains only a generalized description of the present method and does not contain all the details of the invention. For ease of understanding, the description in this section is presented using graphical depictions of the data-structures in their sequential (single-processor) forms along with accompanying descriptions of the graphical figures; the parallelized forms of the data-structures are then depicted in the same fashion. It may serve as an introduction to the basic concepts of the present method so that the reader will find the other descriptions easier to follow.

The problem that the invention solves is presented in the section Discussion of Problem. The example given in that section functions in the same manner as the examples given here, but the description in this section explains the underlying functionality that produces the results shown in the previous section. In addition, two Examples are shown here to ensure that the general concept is understood to apply to various types of data-structures. A complete description of the present invention: the basic parallel method, its functioning, and the data-structures that it produces are described in [...]

Preferred Embodiment
Description of the Shapes of Single-processor Data-Structures and their Multi-processor Counter-parts

This section describes the configuration of adapted parallel data-structures, how they are stored on multiple processors or memory storage devices, and how they are similar to their single-processor counter-parts. The values of the data elements in a single-processor data-structure determine its shape [see single B-tree FIG. 1]. The single-processor B-tree in FIG. 1 contains 12 distinct values and has 12 distinct positions for those values. The numerical relationships (greater-than/less-than) between the 12 elements in the B-tree in FIG. 1 determine the shape of the tree and the positions of the elements.




The values of the data elements in a multi-processor data-structure also determine its shape; however, they determine its shape according to the ranges of values into which they fall [see parallel B-tree FIG. 2]. Although the same 12 elements populate the parallel B-tree in FIG. 2, the shape of the parallel B-tree is not determined by the positions of 12 distinct elements, but by the positions of 5 distinct ranges that the 12 elements fall into.

Comparing the trees in FIG. 1 and FIG. 2, we see that the elements 20 and 26 occupy node C in FIG. 1. The contents of node C are determined by the fact that the parent node A of the tree contains the two values 15 and 30: therefore all elements greater than 15 and less than 30 are placed in node C. The Adapted parallel version of the tree in FIG. 2 also has a node C; however, the parallel node C has two parts, C1 (on processor 1) and C2 (on processor 2). The contents of the parallel node C are determined by the fact that the parallel parent node A contains two ranges of values, (15 to 20) and (40 to 45): therefore all elements greater than (15 to 20) and less than (40 to 45) are placed in parallel node C. Therefore the elements in parallel node C fall into the range (21 to 39); these elements are 26, 30 and 33.
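
The range arithmetic in the paragraph above can be checked with a few lines of Python. This is only an illustrative sketch, not part of the described method: the helper name `range_between` and the tuple representation of a range are assumptions made here, and integer elements are assumed as in FIG. 2.

```python
def range_between(lower, upper):
    """Return the integer range lying strictly between two adjacent parent ranges.

    Each range is a (minimum, maximum) tuple. Hypothetical helper for
    illustration only; it assumes integer elements.
    """
    return (lower[1] + 1, upper[0] - 1)


# The two ranges held by the parallel parent node A in FIG. 2.
parent_a = [(15, 20), (40, 45)]

# Everything greater than (15 to 20) and less than (40 to 45) belongs to node C.
node_c_range = range_between(parent_a[0], parent_a[1])

# The elements the text places in parallel node C.
node_c_elements = [26, 30, 33]
all_in_range = all(node_c_range[0] <= e <= node_c_range[1] for e in node_c_elements)
print(node_c_range, all_in_range)
```

The point of the sketch is that the child's range follows from the parent's ranges alone, without looking at any individual element.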

The parallel B-tree in FIG. 2 is composed of two identically shaped trees (one on each processor). The elements in these identical trees are also positioned identically within the tree according to the ranges that they fall into. This grouping of elements according to identical ranges on each processor creates a Global-node or G-node: the G-node is a collection of data elements in identical positions within identical data-structures contained on multiple processors or processing components. Each G-node has its range (G-node Range) recorded on each processor. The G-node with G-node Range (40 to 45) is positioned as the second entry in node A in FIG. 2. If the value 43 were Inserted by either processor into the parallel B-tree, then it would take position in this G-node because it falls into the G-node Range (40 to 45). This G-node would then contain the values 40, 43, and 45. The concept of the G-node is central to the functioning of the parallelized method/data-structure: once the concepts of the G-node, the G-node Range and the G-node Split are firmly grasped, the present method should be fairly easy to comprehend. The G-node Split is explained in the following section.
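
As a rough illustration of the G-node just described, the sketch below models one G-node as a shared range plus one list of elements per processor. The class and method names are hypothetical, the "first P-node with a free cell" placement policy is an assumption, and so is the choice of which processor initially holds 40 and which holds 45; the text only states that the G-node ends up containing 40, 43, and 45.

```python
class GNode:
    """A G-node: one P-node (a list of elements) per processor plus a shared range."""

    def __init__(self, low, high, p_nodes, max_per_p_node=2):
        self.low = low
        self.high = high
        self.p_nodes = p_nodes            # p_nodes[i] lives on processor i + 1
        self.max_per_p_node = max_per_p_node

    def in_range(self, value):
        return self.low <= value <= self.high

    def insert(self, value):
        """Place value in the first P-node with a free cell; return its processor number."""
        if not self.in_range(value):
            raise ValueError(f"{value} is outside the G-node Range ({self.low} to {self.high})")
        for index, p_node in enumerate(self.p_nodes):
            if len(p_node) < self.max_per_p_node:
                p_node.append(value)
                return index + 1
        raise RuntimeError("G-node is full and would have to be Split first")

    def values(self):
        return sorted(v for p_node in self.p_nodes for v in p_node)


# Second entry of node A in FIG. 2; the initial placement of 40 and 45 is assumed.
g = GNode(40, 45, p_nodes=[[40], [45]])
processor = g.insert(43)
print(processor, g.values())
```

Whichever processor receives 43, the G-node as a whole holds the same set of values, which is why either processor may originate the Insert.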

Preferred Embodiment 
Verbal Description (Insert and Remove) 

This section gives a verbal description of how the preferred embodiment functions. Adapted parallel data-structures created by the preferred embodiment are always composed of P identical data-structures, each contained on one of P processors or system components. The adapted parallel data-structures take form and are organized according to the same principles (with respect to G-node Ranges) that form and organize the single-processor data-structures from which the parallel versions are derived.

As mentioned previously, the single-processor data-structures to be adapted are created and maintained through the use of Insert and Remove functions for the respective data-structures. The ability to Insert and Remove from ordered lists of data implies the ability to search. Search (Find) functions are preferred aspects of the single-processor Insert and Remove functions in general.

The multi-processor Insert, Remove and Find functions may be originated at any time, on any processor (1 to P). The processor originating the Insert, Remove or Find function may or may not need to involve the other processors in the effort. In some cases these functions can be executed by a single processor within the parallel or distributed system. Whether or not other processors need to be involved depends on how much room there is in a G-node for Insert and whether or not a specific value is present on a given processor for Find or Remove.

For this general description, the parallel versions of the Insert and Remove functions may be said to have three phases: (1) Location of the proper G-node on the originating processor (2) Location of the proper processor (1 through P) with insertion or removal of the element on that processor (3) Performance of G-node Split or G-node deletion if necessary. Step 1 can be performed by any single processor, at any time, independent of the other processors. Step 2 involves more than one processor in a cooperative effort unless the "proper processor" is the processor that originated the Insert or Remove. Step 3 usually requires all processors to communicate for a G-node Split because the elements in the G-node must be sorted across processors for a Split; Step 3 usually does not require all processors to communicate for a G-node deletion.
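
The three phases can be sketched in a single Python process as follows. This is a simplified illustration under several assumptions that are not taken from the text: the G-nodes are kept in a flat ordered list rather than a tree, elements are distinct integers, a G-node counts as full when every P-node holds two elements, placement uses the first processor with a free cell, and a Split deals the sorted values back out round-robin. All function names are hypothetical, and the inter-processor communication of phases 2 and 3 is only simulated.

```python
MAX_PER_P_NODE = 2  # simplest Rule for Fullness: 0 to 2 elements per processor


class GNode:
    def __init__(self, low, high, num_processors):
        self.low, self.high = low, high
        self.p_nodes = [[] for _ in range(num_processors)]

    def is_full(self):
        return all(len(p) == MAX_PER_P_NODE for p in self.p_nodes)


def find_g_node(g_nodes, value):
    """Phase 1: locate the G-node whose range holds value (no cooperation needed)."""
    for g in g_nodes:
        if g.low <= value <= g.high:
            return g
    raise LookupError(value)


def place(g_node, value):
    """Phase 2: put value on the first processor with a free cell; return its index."""
    for index, p_node in enumerate(g_node.p_nodes):
        if len(p_node) < MAX_PER_P_NODE:
            p_node.append(value)
            return index
    raise RuntimeError("no free cell")


def split(g_nodes, g_node):
    """Phase 3: sort across processors, divide into sets X and Y, add a new G-node."""
    values = sorted(v for p in g_node.p_nodes for v in p)
    half = len(values) // 2
    x, y = values[:half], values[half:]
    n = len(g_node.p_nodes)
    new = GNode(y[0], g_node.high, n)     # Y moves to the newly created G-node
    g_node.high = y[0] - 1                # X stays; ranges follow the division
    g_node.p_nodes = [x[i::n] for i in range(n)]
    new.p_nodes = [y[i::n] for i in range(n)]
    g_nodes.insert(g_nodes.index(g_node) + 1, new)


def insert(g_nodes, value):
    g = find_g_node(g_nodes, value)
    place(g, value)
    if g.is_full():
        split(g_nodes, g)


# Two processors, starting from one G-node covering MINVAL to MAXVAL.
g_nodes = [GNode(float("-inf"), float("inf"), num_processors=2)]
for v in (26, 30, 33, 20):
    insert(g_nodes, v)
ranges = [(g.low, g.high) for g in g_nodes]
print(ranges)
```

In this sketch the fourth Insert fills the only G-node, so it is Split into two G-nodes with distinct, adjacent ranges, each holding half of the values.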

The following steps 1 through 3 are also identified in the pseudo-code for the parallel Insert and Remove functions given in the Program Adaptation section.
Step 1

(Location of the proper G-node on the originating processor (Find G-node))

The functioning of the parallelized method depends on the functioning of the single-processor method. The single-processor method functions according to the relationships between the values of the elements: the multi-processor method functions according to the relationships of the ranges of values of the elements.

The search of an ordered list is performed by comparing the values found at positions within the data-structure. For Example: Searching the single-processor B-tree for 33 in FIG. 1, we start at the top node and compare the values. 33 falls between 30 and 45, so we travel down the link under 45 and find node D. Searching node D from left to right we immediately locate 33. Searching the multi-processor B-tree for 33 in FIG. 2, we start at the top node and compare the ranges. 33 falls between the ranges (15 to 20) and (40 to 45), so we travel down the link under (40 to 45) and find node C. Parallel node C may be located by either processor 1 or processor 2. Searching parallel node C for the value 33 is described in Step 2.
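
The range-based descent of Step 1 can be sketched as below. The `Node` class and `find_g_node` function are hypothetical names introduced for illustration. The ranges for nodes A and C come from the text; the ranges given to nodes B and D are assumptions, since this section does not state them.

```python
class Node:
    """One per-processor tree node: an ordered list of G-node Ranges plus child links."""

    def __init__(self, name, ranges, children=()):
        self.name = name
        self.ranges = ranges          # list of (minimum, maximum) tuples
        self.children = list(children)


def find_g_node(root, value):
    """Return the names of the nodes visited while locating the G-node for value."""
    node, path = root, [root.name]
    while True:
        link = len(node.ranges)       # default: rightmost link
        for i, (low, high) in enumerate(node.ranges):
            if low <= value <= high:
                return path           # value falls inside a range stored in this node
            if value < low:
                link = i              # value lies to the left of range i
                break
        if not node.children:
            return path               # leaf: value would belong to this node
        node = node.children[link]
        path.append(node.name)


INF = float("inf")
# Ranges for A and C are taken from the text; those for B and D are assumed.
b = Node("B", [(-INF, 14)])
c = Node("C", [(21, 39)])
d = Node("D", [(46, INF)])
a = Node("A", [(15, 20), (40, 45)], children=[b, c, d])
print(find_g_node(a, 33))
```

The loop is the ordinary B-tree descent with one change: each comparison is against a range rather than a single value, which is the substitution the paragraph above describes. Because every processor holds the same tree shape and the same ranges, any processor can run this step on its own.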
Step 2

(Location of the proper processor (1 through P))

Once a given processor p has successfully located the proper G-node within its data-structure, it may then send the location of this G-node to the other processors in the system. Each of these processors may then attempt to locate the search value within its own portion of the G-node or attempt to place a value in the proper G-node.

In Step 1 above, we located G-node C (FIG. 2) as the proper node for 33. If the originating processor is processor 1, it sends a request to processor 2 to search G-node C; processor 2 then searches its portion of the G-node and finds the value; it may then Insert or Remove the value from the data-structure. If the originating processor is processor 2, it immediately locates the value 33 and need not make any request of processor 1.
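
The difference between the two originating processors can be made concrete with a small simulation. The function `locate` and its request counter are hypothetical; a real system would exchange messages between processors (possibly in parallel) rather than loop over a dictionary. The text implies that 33 is held by processor 2; placing 26 and 30 on processor 1 is an assumption made for this example.

```python
def locate(p_nodes, value, origin):
    """Find which processor holds value inside one G-node.

    p_nodes maps processor number -> elements held by that processor's P-node.
    Returns (owner, requests_sent); owner is None when the value is absent.
    """
    if value in p_nodes[origin]:
        return origin, 0              # found locally, no cooperation needed
    requests = 0
    for processor, elements in p_nodes.items():
        if processor == origin:
            continue
        requests += 1                 # stands in for one request message
        if value in elements:
            return processor, requests
    return None, requests


# G-node C of FIG. 2 with unordered internal values (placement of 26 and 30 assumed).
g_node_c = {1: [26, 30], 2: [33]}
from_p1 = locate(g_node_c, 33, origin=1)
from_p2 = locate(g_node_c, 33, origin=2)
print(from_p1, from_p2)
```

When the search originates on processor 1, one request is needed; when it originates on processor 2, none is.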

Whether or not the originating processor needs to send requests to other processors for location of values is dependent on the ordering of values within the G-node. The data-structures in FIG. 2 have G-nodes with unordered internal values. For a discussion on ordering values within G-nodes, see other sections.

Step 3

(Performance of G-node Split or G-node deletion if necessary)

A G-node Split is the creation of a new G-node;




a G-node deletion is the destruction of an existing G-node. When a G-node is considered full, it is Split; when it is considered empty (or sufficiently empty), it is destroyed and deleted from the data-structure. The fullness or emptiness of a G-node is similar in conception to the fullness or emptiness of a node in a single-processor B-tree or m-way search tree. The Rule for Fullness in this section is set forth in detail below.

The G-node Split is described in the definition section above; this definition is sufficient for the General Description. For a more detailed description, see the Function Explanation sections. The G-node deletion is simply the removal of the G-node from the data-structure.

Once a G-node is created or destroyed, it must be added or deleted from the given data-structure according to the Rules for the data-structure with respect to the G-node Ranges. Examining the data-structure on processor 1 in FIG. 2, we see that it is a valid B-tree, in its own right, regardless of the existence of other processors: if we remove the value 5 from this serial B-tree, we produce the B-tree on processor 1 in FIG. 3. When both processor 1 and 2 perform this removal simultaneously, each processor redistributes the B-tree according to the rules of B-tree configuration: the result is a G-node deletion. Note that this would require the absence of all three values in the G-node: 4, 5, and 12. The point being made here is that G-node additions and deletions function according to the same rules as the single-processor data-structures. This process is clarified further in the two examples in the following section.

Preferred Embodiment
Descriptions by Example

A single-processor method creates its data-structure (such as a B-tree) by Inserting and Removing the values contained in the list to be maintained according to the Rules of the Insert and Remove functions for the data-structure. The Method of Adapting single-processor methods and their associated data-structures into multi-processor methods and data-structures makes use of the single-processor method. Each of the following two Examples will create precisely the same configurations for their data-structures in both the single and multi-processor versions described. A reader understanding the functioning of AVL trees and B-trees should be able to see the functioning of the multi-processor method as a transformation of the single-processor method in each case. The present method transforms the single-processor method into the multi-processor method.

Any implementation of a parallelized data-structure may utilize variations on the Rules for Fullness of G-nodes and

The following two Examples explain the underlying functioning of the preferred embodiment by maintaining the same list of values on two parallelized data-structures. Each Example first describes the functioning of the single-processor version of the data-structure on a similar list. Each Example then describes the parallel version of the data-structure. These Examples both use three (P=3) processors for the parallel data-structures.

The single-processor versions of the lists are roughly one-third the size of the parallel versions. Each single value inserted or deleted in the single-processor data-structure is matched by values for the parallel version. The multiple values inserted and deleted in the parallel version are specifically chosen to fall into the proper G-node Ranges so that the single and multi-processor data-structures take on the same configurations: this is done so that the identical functioning, form and structure of the single and multi-processor versions can be easily seen. The functioning of the parallel versions is in no way dependent on any choice of element values (any list of ordinable data elements may be Inserted or Removed in any order).

EXAMPLE 1
Single-processor Method

The single-processor AVL tree method is composed of finding the proper location for a new node, adding that node, and performing rotation.

Example 1 begins with FIG. 4, showing a properly ordered single-processor AVL tree containing the elements from the single-processor initial list.

1.) Insert(60)

Comparing values at each node: Root-node A: [60>40], travel down the right-most link to node C; node C: [60>50], travel right to node F; node F: [60<70] — F has no left link so we create a new node G and place it to the left of node F (FIG. 5).

Node G has been added in its proper place; the AVL tree is left unbalanced, therefore we perform RL rotation (FIG. 6).

2.) Insert(80)

Root-node A: [80>40], travel right to node G; node G: [80>60], travel right to node F; node F: [80>70] — F has no right link so we create a new node H and place it to the right of node F. The tree is still balanced (no rotation) (FIG. 7).

3.) Insert(90)

Root-node A: [90>40], travel right to node G; node G: [90>60], travel right to node F; node F: [90>70], travel right to node H; node H: [90>80] — H has no right link, so we create a new node I and place it to the right of H (FIG. 8).

Node I has been added in its proper place; the AVL tree is left unbalanced, therefore we perform RR rotation (FIG.
utilize variations on the Rules for Fullness of G-nodes and » Ieft unbalanced > we perform RR rotation (FIG. 
the Ordering Scheme of elements within G-nodes. The 

following rules and ordering scheme will be used for these Rem ° ve (°°) 

two Examples: Root-node A: [60>40], travel right to node G; node G: 

55 [60=60], therefore delete node G; replace node G with the 

1. Rules for fullness/emptiness: each G-node in these left-most child of the right sub -tree (node F). The tree is still 
Examples will be composed of 3 sets of elements; each balanced (no rotation) (FIG. 10). 

set will contain zero through two elements. A G-node Multi-processor Metho d 

i is full when all three sets contain two elements (the . The multi-processor AVL tree method is composed o f 

G-node therefore containing six elements). The G-node 60 finding the proper location for a n ew value. insertimTthe 

is empty when all three sets contain zero elements. v ahies until a G-node Split thereWcreating a new G-nod e, 

2. Ordering Scheme: each P-node set within a G-node is a dding that G-node, and then performing H)\n tinn '"-praT 1 " 1 
contained on a single-processor (PI, P2 or P3 ). The jRefer to Steps 1 through 3 in the Verbal Descriptio n 
elements in the G-node will be kept in ascending order section, 

across the processors and evenly distributed (PI con- 65 Example 1 (multi-processor) begins with FIG. 11, show- 

taining the smallest values, P2 the mid-most, P3 the ing an Adapted AVL tree composed of 3 properly ordered 

largest). AVL trees on 3 processors containing the elements from the 



04/01/2003, EAST Version: 1.03.0002 



6,138,123 


multi-processor initial list. The G-node Ranges are shown beneath the parallel nodes.

1.) Insert(60) [after which Insert(65)]

Insert(60) at processor P1
Step 1

Comparing values at each G-node:

Root-node A1: [(60)>(40-49)], travel down the right-most link to node C1; node C1: [(60)>(50-59)], travel right to node F1; node F1: [(60)=(60-max)]: 60 falls within G-node Range (60-max), so we add 60 to this G-node at processor P1.
Step 2

The values are properly ordered within the G-node, so Step 2 is not necessary.
Step 3

The G-node is NOT full, so Step 3 is not necessary (no G-node Split). (FIG. 12)

Insert(65) at processor P1
Step 1

Root-node A1: [(65)>(40-49)], travel right to node C1; node C1: [(65)>(50-59)], travel right to node F1; node F1: [(65)=(60-max)]: 65 falls within G-node Range (60-max), so we add 65 to this G-node at processor P1. (FIG. 13)
Step 2

As FIG. 13 shows, the G-node F1 at processor P1 has 3 values. This is greater than maximum number of values per processor per G-node, so we perform Step 2 to properly order the values within the G-node in F. Processor P1 sends the value 70 to P2, and P2 sends 75 to P3. This exchange of elements maintains the Ordering Scheme within G-nodes, locating the proper processors for each value (rule 2 listed above) (FIG. 14).
Step 3

After addition of 65 to this G-node, the G-node is full, therefore we perform a G-node Split. The G-node Split divides the values 60,65,70,74,75,78 into two ranges. The lower range is placed into a newly created G-node (in G), the upper range is kept in the G-node in F (FIG. 15).

The new G-node (in G) is placed to the left of the G-node in F because its G-node Range (60-70) is less than the G-node Range of F (71-max) (FIG. 16).

The addition of the new node F leaves the AVL tree unbalanced as it did in the single-processor example, therefore we perform RL rotation in parallel (FIG. 17).

2.) Insert(80), Insert(89), Insert(85)

These three Insertions are performed simultaneously. Because the Insertions at processors P1, P2, and P3 all follow the same procedure, we will follow Step 1 only at processor P2.

Insert(80) at P1, Insert(89) at P2, Insert(85) at P3
Step 1 at Processor P2

Root-node A2: [(89)>(40-49)], travel right to node G2; node G2: [(89)>(60-70)], travel right to node F2; node F2: [(89)=(71-max)]: 89 falls within G-node Range (71-max), so we add 89 to this G-node at processor P2. Following identical comparisons at processors P1 and P3, the values 80 and 85 have been added to the same G-node (FIG. 18).
Step 2

As FIG. 18 shows, the values are not arranged in ascending order within the G-node in F, so we perform Step 2. Processor P1 sends the value 80 to P2, and P2 sends 75 to P1 and also 89 to P3, P3 sends 78 to P2. This exchange of elements maintains the Ordering Scheme within G-nodes, locating the proper processors for each value (rule 2 listed above) (FIG. 19).
Step 3

After addition of 80, 89 and 85 to this G-node, the G-node is full, therefore we perform a G-node Split. The G-node Split divides the values 74,75,78,80,85,89 into two ranges. The lower range is placed into a newly created G-node which is in the AVL tree node (F), the upper range is kept in the G-node in the newly formed AVL tree node (H) (FIG. 20).

AVL tree node (H) containing the old G-node is placed to right of node F because its G-node Range (79-max) is greater than the G-node Range of F (71-78) (FIG. 21). The addition of the new node H leaves the AVL tree balanced (no rotation).

3.) Insert(98), Insert(95), Insert(90)

These three Insertions are performed simultaneously. Because the Insertions at processors P1, P2, and P3 all follow the same procedure, we will follow Step 1 only at processor P3.

Insert(98) at P1, Insert(95) at P2, Insert(90) at P3
Step 1 at Processor P3

Root-node A3: [(90)>(40-49)], travel right to node G3; node G3: [(90)>(60-70)], travel right to node F3; node F3: [(90)>(71-78)], travel right to node H3; node H3: [(90)=(79-max)]: 90 falls within G-node Range (79-max), so we add 90 to this G-node at processor P3. Following identical comparisons at processors P1 and P2, the values 98 and 95 have been added to the same G-node (FIG. 22).
Step 2

As FIG. 22 shows, the values are not arranged in ascending order within the G-node in H, so we perform Step 2. Processor P1 sends the value 98 to P3, and P2 sends 85 to P1 and also 95 to P3, P3 sends 89 and 90 to P2. This exchange of elements maintains the Ordering Scheme within G-nodes, locating the proper processors for each value (rule 2 listed above) (FIG. 23).
Step 3

After addition of 98, 95 and 90 to this G-node, the G-node is full, therefore we perform a G-node Split. The G-node Split divides the values 80,85,89,90,95,98 into two ranges. The upper range is placed into a newly created G-node (I), the lower range is kept in the G-node in H (FIG. 24).

The new G-node (I) is placed to right of G-node H because its G-node Range (90-max) is greater than the G-node Range of H (79-89) (FIG. 25). The addition of the new G-node I leaves the AVL tree unbalanced as it did in the single-processor example, therefore we perform RR rotation in parallel (FIG. 26).

4.) Remove(55), Remove(65), Remove(70), [after which Remove(60)]

Remove(55) at P1, Remove(65) at P2, Remove(70) at P3 (performed simultaneously)

Remove(55) at P1
Step 1

Root-node A1: [(55)>(40-49)], travel right to node G1; node G1: [(55)<(60-70)], travel left to node C1; node C1: [(55)=(50-59)]: 55 falls within the G-node Range (50-59), so processor P1 looks in node C1 for the value 55. 55 is not in node C1 at processor P1, so we must perform Step 2.
Step 2

P1 sends a request to the other processors to look for 55 in their respective nodes C2 and C3. Processor P2 finds 55 in its node C2. P2 removes the value 55 from node C2 and sends it to P1 (FIG. 27).
Step 3

G-node C is not empty and so Step 3 is not necessary.

Remove(65) at P2

Root-node A2: [(65)>(40-49)], travel right to node G2; node G2: [(65)=(60-70)]: 65 falls within the G-node



Range (60-70), so processor P2 looks in node G2 for 
the value 65 and finds 65 in G2. 
P2 then removes 65 from node G2. 
Step 2 

The value that P2 searched for (65) was found at P2, 
therefore Step 2 is not necessary (FIG. 27). 
Step 3 

G-node G is not empty and so Step 3 is not necessary. 
Remove(70) at P3 

Root-node A3: [(70)>(40-49)], travel right to node G3;
node G3: [(70)=(60-70)]— 70 falls within the G-node 
Range (60-70), so processor P3 looks in node G3 for 
the value 70 and finds 70 in G3. 

P3 then removes 70 from node G3. 
Step 2 

The value that P3 searched for (70) was found at P3, 
therefore Step 2 is not necessary (FIG. 27). 
Step 3 

The G-node in G is not empty and so Step 3 is not 
necessary. 

Remove(60) at P2 

Root-node A2: [(60)>(40-49)], travel right to node G2; 
node G2: [(60)=(60-70)]: 60 falls within the G-node
Range (60-70), so processor P2 looks in node G2 for 
the value 60. 60 is not in node G2 at processor P2, so 
we must perform Step 2. 
Step 2 

P2 sends a request to the other processors to look for 60
in their respective nodes G1 and G3. Processor P1 finds 60
in its node G1. P1 removes the value 60 from node G1 and
sends it to P2 (FIG. 28).
Step 3 

The G-node in G is empty and so we perform Step 3. The
removal of the G-node G is simply a matter of each of the
processors P1, P2 and P3 performing a normal AVL node
removal. P1 removes G1 from its tree; P2 removes G2 from
its tree; P3 removes G3 from its tree. Each of the processors
re-orders the tree according to the single-processor AVL tree
method, replaces node G with the left-most child of the
right sub-tree (node F), and performs range adjustment
(FIG. 29).
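
The three removal steps walked through above can be sketched in a few lines of Python. This is only an illustrative, sequential simulation under stated assumptions: the dictionary layout, the function names `remove_from_g_node` and `g_node_is_empty`, and the starting contents of G-node G (60 on P1, 65 on P2, 70 on P3, inferred from which processor finds each value in the walk-through) are not taken from the patent's own program.

```python
from typing import Dict, List, Optional


def remove_from_g_node(p_nodes: Dict[str, List[int]],
                       requester: str,
                       value: int) -> Optional[str]:
    """Remove `value` from a G-node and return the processor that held it.

    `p_nodes` maps a processor name to the values in its P-node
    (a hypothetical layout chosen for this sketch).
    """
    # Step 1: the requesting processor looks in its own P-node first.
    if value in p_nodes[requester]:
        p_nodes[requester].remove(value)
        return requester
    # Step 2: otherwise the other processors look in their P-nodes.
    for proc, values in p_nodes.items():
        if proc != requester and value in values:
            values.remove(value)
            return proc
    return None  # value is not stored in this G-node


def g_node_is_empty(p_nodes: Dict[str, List[int]]) -> bool:
    """Step 3 precondition: every P-node of the G-node holds no values."""
    return all(len(values) == 0 for values in p_nodes.values())


# G-node G with Range (60-70); contents inferred from the walk-through.
g = {"P1": [60], "P2": [65], "P3": [70]}
found_65 = remove_from_g_node(g, "P2", 65)   # found locally at P2
found_70 = remove_from_g_node(g, "P3", 70)   # found locally at P3
empty_before = g_node_is_empty(g)            # 60 is still stored on P1
found_60 = remove_from_g_node(g, "P2", 60)   # P2 must ask; P1 holds 60
empty_after = g_node_is_empty(g)             # now Step 3 applies
```

Once `g_node_is_empty` reports true, each processor would run its ordinary single-processor node removal, as Step 3 describes; that part is left out of the sketch.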

EXAMPLE 2 
Single-processor Method

The single-processor B-tree method is composed of find- 
ing the proper location for a new value, adding that value, 
and performing B-tree splits when the B-tree nodes are full 
(contain 3 values). 

Example 2 begins with FIG. 30, showing a properly 
ordered single-processor B-tree (degree 3) containing the 
elements from the single-processor initial list. 
1.) Insert(60) 

Comparing values at each node, moving through the 
tuples from left to right: 

Root-node A: [60>20], move right; [60>40], travel down 
the right-most link to node D; node D: insert 60 
between 50 and 70 at node D (FIG. 31). 

D now has 3 values and must be split. The right-most 
value goes in the new node (E). The left most value is kept 
in node D; the middle value (60) becomes the parent value 
of D and is re-inserted at the parent node A (FIG. 32). 

The parent node A (root-node) now has 3 values and must 
be split. The right-most value goes in the new node (F). The 
left most value is kept in node A; the middle value (40) 
becomes the parent value of A and is re-inserted at the parent 
(no parent exists for the root, so a new root is created — node 
G) (FIG. 33). 
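
The cascade of splits in this insertion can be reproduced with a minimal Python sketch. It assumes, as the text states, that a node is full at 3 values; the function name `split_full_node` and the list representation are hypothetical choices for illustration.

```python
from typing import List, Tuple


def split_full_node(values: List[int]) -> Tuple[List[int], int, List[int]]:
    """Split a full node of three values.

    Returns (values kept in the old node, value promoted to the parent,
    values placed in the new node).
    """
    if len(values) != 3:
        raise ValueError("only a node holding exactly 3 values is split")
    left, middle, right = sorted(values)
    return [left], middle, [right]


# Node D after Insert(60): 50, 60, 70.
kept_in_d, promoted, new_node_e = split_full_node([50, 60, 70])

# The promoted value is re-inserted at the root A, which held 20 and 40.
root_a = sorted([20, 40] + [promoted])
kept_in_a, new_root_value, new_node_f = split_full_node(root_a)
```

Here `promoted` is 60 and `new_root_value` is 40, matching the middle values named in the two splits above.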




2.) Insert(80) 

Root-node G: [80>40] travel right to node F; node F: 
[80>60] travel right to node E; node E: insert 80 after 70 at 
node E (FIG. 34). 
3.) Insert(90)

Root-node G: [90>40], travel right to node F; node F: 
[90>60], travel right to node E; node E: insert 90 after 80 at 
node E (FIG. 35). Node E now has 3 values and must be 
split. The right most value goes in the new node (H). The left 
most value is kept in node E; the middle value (80) becomes 
the parent value of E and is re-inserted at the parent node F 
(FIG. 36). 
4.) Remove(60) 

Root-node G: [60>40], travel right to node F; node F: 60
is found at node F and removed. This leaves F with too few
values, so the method removes node E, places its value (70)
in node D, and makes 80 the parent value of node D (FIG. 37).
Multi-processor Method 

The multi-processor B-tree method is composed of finding
the proper location for a new value, inserting values
until a G-node Split creates a new G-node, adding
that G-node, and performing B-tree splits when the B-tree
nodes are full (contain 3 G-nodes). (The G-nodes constitute
the elements of the B-tree.)

Refer to Steps 1 through 3 in the Verbal Description section.
Example 2 (multi-processor) begins with FIG. 38, showing
an Adapted B-tree composed of 3 properly ordered
B-trees on 3 processors containing the elements from the
multi-processor initial list. The G-node Ranges are shown
beneath the parallel G-nodes.
1.) Insert(60) [after which Insert(65)]
Insert(60) at processor P1
Step 1

Comparing values at each node, moving through the
tuples from left to right:
Root-node A1: [(60)>(20-29)], move right; [(60)>
(40-49)], travel down the right-most link to node D1;
node D1: insert 60 into right-most G-node in node D.
Step 2

The values are properly ordered within the G-node, so
Step 2 is not necessary.
Step 3

The G-node is NOT full, so Step 3 is not necessary (no
G-node Split). (Note that although node D1 has three values,
it contains only 2 G-nodes and therefore does not need a
B-tree split.) (FIG. 39)

Insert(65) at processor P1
Step 1

Root-node A1: [(65)>(20-29)], move right; [(65)>
(40-49)], travel down right-most link to node D1; node D1:
insert 65 into second G-node in node D (FIG. 40).
Step 2

As FIG. 40 shows, the second G-node in D1 at processor
P1 has 3 values. This is greater than the maximum number of
values per processor per G-node, so we perform Step 2 to
properly order the values within the G-node. Processor P1
sends the value 70 to P2, and P2 sends 75 to P3. This
exchange of elements maintains the Ordering Scheme within
G-nodes, locating the proper processors for each value (rule
2 listed above) (FIG. 41).
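
The effect of Step 2 can be illustrated with a short Python sketch. It is a simplification under stated assumptions: rather than modelling the processor-to-processor messages, it gathers the values, sorts them, and deals them back out in ascending order, which yields the same final placement as the exchanges described. The function name `semi_sort`, the list-of-lists layout, and the starting contents (inferred from which values each processor sends) are hypothetical.

```python
from typing import List


def semi_sort(p_nodes: List[List[int]]) -> List[List[int]]:
    """Return the P-nodes of one G-node with values ascending across
    processors and evenly distributed (P1 smallest, P3 largest)."""
    all_values = sorted(v for node in p_nodes for v in node)
    count = len(p_nodes)
    base, extra = divmod(len(all_values), count)
    result, start = [], 0
    for i in range(count):
        # Assumption: any leftover values go to the lowest processors.
        size = base + (1 if i < extra else 0)
        result.append(all_values[start:start + size])
        start += size
    return result


# After Insert(65): P1 holds three values, so 70 moves to P2 and 75 to P3.
after_65 = semi_sort([[60, 65, 70], [74, 75], [78]])

# After Insert(80), Insert(89), Insert(85) into the G-node with Range (71-max).
after_80s = semi_sort([[74, 80], [75, 89], [78, 85]])
```

A real multi-processor version would exchange only the misplaced values, as the text describes, instead of gathering everything in one place.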

Step 3

After addition of 65 to this G-node, the G-node is full,
therefore we perform a G-node Split. The G-node Split
divides the values 60,65,70,74,75,78 into two ranges. The
lower range is placed into a newly created G-node, the upper
range is kept in the existing G-node (FIG. 15).

The new G-node is placed to left of the existing G-node
because its G-node Range (60-70) is less than the G-node






Range (71-max) (FIG. 42). D now contains 3 G-nodes and 
must be split (B-tree split). The right most G-node goes in 
the new B-tree node (E). The left most G-node is kept in 
B-tree node D; the middle G-node with G-node Range 
(60-70) becomes the parent value of D and is re-inserted at 
the parent node A (FIG. 43). The parent node A (B-tree- 
root-node) now has 3 G-nodes and must be split. The right 
most G-node goes in the new B-tree node (F). The left most 
G-node is kept in node A; the middle G-node (40-49) 
becomes the parent value of A and is re-inserted at the parent 
(no parent exists for the root, so a new root is created — B- 
tree node G) (FIG. 44). 

2.) Insert(80),Insert(89),Insert(85) 

These three Insertions are performed simultaneously.
Because the Insertions at processors P1, P2, and P3 all
follow the same procedure, we will follow Step 1 only at
processor P2.

Insert(80) at P1, Insert(89) at P2, Insert(85) at P3
Step 1 at processor P2

Root-node G2: [(89)>(40-49)], travel right to node F2;
node F2: [(89)>(60-70)], travel right to node E2; node E2:
[(89)=(71-max)]: 89 falls within G-node Range (71-max),
so we add 89 to this G-node at processor P2. Following
identical comparisons at processors P1 and P3, the values 80
and 85 have been added to the same G-node (FIG. 45).
Step 2 

As FIG. 45 shows, the values are not arranged in ascending
order within the G-node at E, so we perform Step 2.
Processor P1 sends the value 80 to P2, and P2 sends 75 to
P1 and also 89 to P3, P3 sends 78 to P2. This exchange of
elements maintains the Ordering Scheme within G-nodes,
locating the proper processors for each value (rule 2 listed
above) (FIG. 46).
Step 3 

After addition of 80, 89 and 85 to this G-node, the G-node 
is full, therefore we perform a G-node Split. The G-node 
Split divides the values 74,75,78,80,85,89 into two ranges. 
The lower range is placed into a newly created G-node, the 
upper range is kept in the existing G-node (FIG. 20). 

The new G-node is placed to left of the existing G-node 
because its G-node Range (71-78) is less than the G-node 
Range (79-max) (FIG. 47). 
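
A G-node Split and the resulting G-node Ranges can be sketched as follows. The range arithmetic is inferred from the worked examples, not quoted from the patent: the lower G-node keeps the old lower bound and ends at its largest value, and the upper G-node starts one above that and keeps the old upper bound. The names `split_g_node` and `MAX`, and the use of integer keys, are assumptions made for illustration.

```python
from typing import List, Tuple

MAX = float("inf")  # stands in for the open upper bound written "max"

Range = Tuple[float, float]


def split_g_node(values: List[int], low: float, high: float
                 ) -> Tuple[Tuple[Range, List[int]], Tuple[Range, List[int]]]:
    """Split a full G-node into a lower and an upper G-node.

    Assumes integer keys and a non-empty, full G-node (six values in
    these Examples).
    """
    ordered = sorted(values)
    mid = len(ordered) // 2
    lower, upper = ordered[:mid], ordered[mid:]
    lower_range = (low, lower[-1])        # ends at its largest value
    upper_range = (lower[-1] + 1, high)   # starts just above the lower range
    return (lower_range, lower), (upper_range, upper)


# The split described above: G-node with Range (71-max).
lower_g, upper_g = split_g_node([74, 75, 78, 80, 85, 89], 71, MAX)
```

With these inputs the lower G-node has Range (71-78) and the upper has Range (79-max), the same ranges the text gives for FIG. 47.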
3.) Insert(98), Insert(95), Insert(90)

These three Insertions are performed simultaneously.
Because the Insertions at processors P1, P2, and P3 all
follow the same procedure, we will follow Step 1 only at
processor P3.

Insert(98) at P1, Insert(95) at P2, Insert(90) at P3
Step 1 at processor P3

Root-node G3: [(90)>(40-49)], travel right to node F3;
node F3: [(90)>(60-70)], travel right to node E3; node
E3: [(90)>(71-78)], move right; [(90)=(79-max)]: 90
falls within G-node Range (79-max), so we add 90 to
this G-node at processor P3. Following identical
comparisons at processors P1 and P2, the values 98 and 95
have been added to the same G-node (FIG. 48).
Step 2 

As FIG. 48 shows, the values are not arranged in ascending
order within the G-node at E, so we perform Step 2.
Processor P1 sends the value 98 to P3, and P2 sends 85 to
P1 and also 95 to P3, P3 sends 89 and 90 to P2. This
exchange of elements maintains the Ordering Scheme within
G-nodes, locating the proper processors for each value (rule
2 listed above) (FIG. 49).
Step 3 

After addition of 98, 95 and 90 to this G-node, the G-node
is full, therefore we perform a G-node Split. The G-node
Split divides the values 80,85,89,90,95,98 into two ranges.
The lower range is placed into a newly created G-node, the
upper range is kept in the existing G-node (FIG. 24).

The new G-node is placed to left of the existing G-node
because its G-node Range (79-89) is less than the G-node
Range (90-max) (FIG. 50). Node E now has 3 G-nodes and
must be split (B-tree split). The right most G-node goes in
the new node (H). The left most G-node is kept in node E;
the middle G-node (79-89) becomes the parent G-node of E
and is re-inserted at the parent node F (FIG. 51).

4.) Remove(55), Remove(65), Remove(70), [after which 
Remove(60)] 

Remove(55) at P1, Remove(65) at P2, Remove(70) at P3
(performed simultaneously)
Remove(55) at P1
Step 1

Root-node G1: [(55)>(40-49)], travel right to node F1;
node F1: [(55)<(60-70)], travel left to node D1; node D1:
[(55)=(50-59)]: 55 falls within the G-node Range (50-59),
so processor P1 looks in node D1 for the value 55. 55 is not
in node D1 at processor P1, so we must perform Step 2.
Step 2

P1 sends a request to the other processors to look for 55 in
their respective nodes D2 and D3. Processor P2 finds 55 in
its node D2. P2 removes the value 55 from node D2 and
sends it to P1 (FIG. 52).
Step 3 

The G-node in D is not empty and so Step 3 is not 
necessary. 

Remove(65) at P2

Root-node G2: [(65)>(40-49)], travel
right to node F2; node F2: [(65)=(60-70)]: 65 falls
within the G-node Range (60-70), so processor P2
looks in node F2 for the value 65 and finds 65 in F2. P2
then removes 65 from node F2.

Step 2

The value that P2 searched for (65) was found at P2,
therefore Step 2 is not necessary (FIG. 52).
Step 3

The G-node in F is not empty and so Step 3 is not
necessary.

Remove(70) at P3 

Root-node G3: [(70)>(40-49)], travel right to node F3;
node F3: [(70)=(60-70)]— 70 falls within the G-node 
Range (60-70), so processor P3 looks in node F3 for 
the value 70 and finds 70 in F3. P3 then removes 70 
from node F3. 
Step 2 

The value that P3 searched for (70) was found at P3,
therefore Step 2 is not necessary (FIG. 52).
Step 3 

The G-node in F is not empty and so Step 3 is not 
necessary. 

Remove(60) at P2
Root-node G2: [(60)>(40-49)], travel right to node F2;
node F2: [(60)=(60-70)]: 60 falls within the G-node
Range (60-70), so processor P2 looks in node F2 for
the value 60. 60 is not in node F2 at processor P2, so
we must perform Step 2.
Step 2

P2 sends a request to the other processors to look for 60
in their respective nodes F1 and F3. Processor P1 finds 60
in its node F1. P1 removes the value 60 from node F1 and
sends it to P2 (FIG. 53).
Step 3

The G-node in F is empty and so we perform Step 3. The 
removal of the G-node is simply a matter of each of the 




processors P1, P2 and P3 performing a normal B-tree-node removal. P1 removes the G-node from F1 in its tree; P2 removes from F2; P3 removes from F3. Each of the processors re-orders the tree according to the single-processor B-tree method and removes node E, places its G-node (71-78) in node D, makes (79-89) the parent value of node D and performs range adjustment (FIG. 54).

Preferred Embodiment
Rules for Fullness and Ordering Scheme for B-trees Stored on Disk

The usage of Rules for Fullness and Ordering Schemes is described in the previous sections. The Rule for Fullness and Ordering Scheme chosen for those examples assume that the parallel data-structure is not stored on disk. A different rule and scheme should be chosen if the processing-elements of the parallel data-structure are disk-packs rather than actual CPU's. It should also be noted here that the terms "processor" and "processing-element" are used to refer to system components to which work may be distributed in the maintenance of the parallel data-structure: in this section the processing-elements are assumed to be disk-packs on a system with multiple disk drives; the work distributed amongst the disk-packs is the actual reading and writing of the blocks that contain the parallel B-tree-nodes.

In this section, another Example of a parallel data-structure is given. The purpose of the example is to illustrate the functionality of the Rules for Fullness and Ordering Scheme chosen for the B-tree stored on disk. The example describes one possible embodiment of an adapted B-tree. The manner of describing this example is the same as the manner used in the previous sections.

The main difference between storing data in memory and on disk is that disk access is slower. Assuming that the location of the memory block or disk block is known, accessing data on disk might take milli-seconds whereas accessing data in memory would take only micro-seconds. Therefore, the goal in designing data-structures to be stored on disk is to minimize the number of disk accesses necessary to locate the desired data-block. The goal in the design of the parallel data-structures described in this invention is to allow the same data-structure to be accessed simultaneously by multiple processing-elements (or disk-packs in this section) and thus distribute the work amongst the processing-elements. Because the goal in designing data-structures for disk is to minimize accesses, the Rule for Fullness and the Ordering Scheme of a disk-stored parallel B-tree must be defined to minimize parallel communication between processing-elements (disk-packs) and provide the most efficient access paths possible to desired P-nodes. Steps 2 and 3 described in the Verbal Description require parallel communication: the parallel communication in Step 2 can be minimized by choosing an Ordering Scheme that does not involve all of the disk-packs in locating the proper disk-pack for placement of a value. The Rule for Fullness can also be altered so that determining the fullness or emptiness of a G-node does not involve all of the disks.

The following Rule for Fullness and Ordering Scheme will be used in the example for this section:

1. Rule for Fullness/Emptiness: The fullness of a G-node in this Example is dependent on the fullness of the B-tree node that contains the G-node. A B-tree node is considered full when it contains five values (integers) and is thereby ready to undergo a B-tree split. A G-node may be considered full when one-half of the B-tree-nodes that contain the G-node are ready for a B-tree split; a G-node may be considered empty when one-half of the B-tree-nodes that contain it are ready for a merge or deletion. This information can be stored for each parallel B-tree-node outside of the parallel data-structure (possibly in memory). Once one-half of the parallel B-tree-nodes are ready to split, one of the G-nodes within the B-tree-nodes is split.

2. Ordering Scheme: This example uses three disk-packs. Disk 1 will contain the bottom (smallest) one-third of the range of values in a given G-node; Disk 2 will contain the middle one-third; Disk 3 will contain the top (largest) one-third of the Range. (If the G-node Range were (1-100) then Disk 1 would contain the values in roughly the lowest third, starting at 1; Disk 2 would contain the values in the middle third, up to 67; Disk 3 would contain values between 68 and 100).

The Rule for Fullness/Emptiness above minimizes the need to access all portions of the B-tree-node in question because the information for determining the fullness of the parallel B-tree-node is stored external to the tree. The Ordering Scheme above minimizes the need to access all portions of the B-tree-node in question because the location of the proper Disk for a given value in a given Range can be calculated mathematically: this allows the direct location within memory storage of the exact individual node (P-node) contained in a given G-node that could contain a given data value within the G-node's G-node Range. This Example begins with FIG. 55 showing a parallel B-tree ordered according to Rules 1 and 2 above. Note that although the same values are stored in the tree in FIG. 55 as those stored in FIG. 38, the right-most G-node in the tree is ordered differently: this is because of the Ordering Scheme rule above. At the beginning of this Example none of the B-tree-nodes located in the data-structure are ready to be split.

We now proceed to Insert a number of values into the disk-stored B-tree in FIG. 55.

1.) Insert(60) on Disk 1 and Insert(71) on Disk 2 Simultaneously
Step 1

Root-node A1: [(60)>(20-29)], move right; [(60)>(40-49)], travel down the right most link to D1; node D1: insert 60 into right most G-node in D1. (Disk 2 follows the same pattern)
Step 2

The values are properly ordered within the G-node, so Step 2 is unnecessary.
Step 3

The G-node is not full (no G-node Split) (FIG. 56)

2.) Insert(52) at Disk 1, Insert(51) at Disk 2, Insert(59) at Disk 3
Step 1

(Step 1 is followed at Disk 2 in order to illustrate the functionality of the Ordering Scheme)

Root-node A2: [(51)>(20-29)], move right; [(51)>(40-49)], travel down right-most link to node D2; node D2: the value 51 belongs in the G-node with Range (50-59).
Step 2

Send the value 51 to Disk 1 and place it in the G-node with Range (50-59). According to the Ordering Scheme, 51 must be sent to Disk 1 because it is in the bottom one-third of the Range (50-59). Note that Disk 3 is not involved in Step 2 because the values contained in node D3 play no part in determining the proper Disk for 51: one disk access is saved by the Ordering Scheme.
Step 3

B-tree-node D3 now contains five (5) values because of the insertion of the value 59. The addition of the fifth value






and resulting fullness of the B-tree-node D3 is recorded.
Nodes D1 and D2 are still less than full; therefore less than
one-half of the parallel nodes are full, and there is no
Split: Step 3 is unnecessary at this time. (FIG. 57)
3.) Insert(53) at Disk 1
Step 1 

The pattern of locating the proper B-tree-node has been
well established at this point (see other examples). The
correct G-node for insertion of 53 is the G-node in B-tree-
node D1 with Range (50-59).
Step 2 

53 falls in the bottom one-third of the Range (50-59); 
therefore Step 2 is unnecessary (FIG. 58). 
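
Because the Ordering Scheme places each third of a G-node Range on a fixed disk, the proper disk can be computed from the value and the Range alone. The Python sketch below shows one way to do that arithmetic; the function name `disk_for` and the exact integer partition formula are assumptions made for illustration, and the text does not say how the boundaries between thirds are rounded.

```python
def disk_for(value: int, low: int, high: int, disks: int = 3) -> int:
    """Return the 1-based disk number that should hold `value` within
    the G-node Range (low-high), splitting the Range into equal parts."""
    if not low <= value <= high:
        raise ValueError("value lies outside the G-node Range")
    width = high - low + 1
    # Integer arithmetic: which of the `disks` equal slices holds value?
    return (value - low) * disks // width + 1


# Values from the Example, G-node Range (50-59):
disk_51 = disk_for(51, 50, 59)   # bottom third, so Disk 1
disk_53 = disk_for(53, 50, 59)   # bottom third, so Disk 1
disk_59 = disk_for(59, 50, 59)   # top third, so Disk 3
```

No disk needs to be read to evaluate this function, which is the saving the Ordering Scheme is meant to provide.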
Step 3 

The Insertion of 53 into B-tree-node D1 causes D1 to be
full. Node D3 is already full. Therefore, more than one-half
of the parallel nodes are full, and we must perform a B-tree
Split and a G-node Split. This requires accessing the data in
node D on all three disks.
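
The fullness test used in Step 3 can be sketched in Python as a check on per-disk value counts that are kept outside the tree. The names `NODE_CAPACITY`, `ready_to_split` and `choose_g_node_to_split` are hypothetical, and the per-disk counts in the usage lines are illustrative: the text only says which nodes are full, not the exact count on Disk 2.

```python
from typing import Dict, List, Tuple

NODE_CAPACITY = 5  # a B-tree node is full when it contains five values


def ready_to_split(values_per_disk: List[int]) -> bool:
    """True when at least one-half of the parallel B-tree-nodes are full."""
    full = sum(1 for count in values_per_disk if count >= NODE_CAPACITY)
    return 2 * full >= len(values_per_disk)


def choose_g_node_to_split(sizes: Dict[Tuple[int, int], int]) -> Tuple[int, int]:
    """Pick the G-node (keyed by its Range) that holds the most values."""
    return max(sizes, key=sizes.get)


# Only D3 is full: less than one-half, so no Split yet.
before_53 = ready_to_split([4, 4, 5])
# D1 and D3 are full: more than one-half, so a Split is due.
after_53 = ready_to_split([5, 4, 5])
```

Only the recorded counts are consulted, so the check itself requires no disk access; the disks are read only once a Split is actually due.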

Parallel node D contains two G-nodes: one with Range
(50-59), the other with Range (60-78)[Max]. The G-node
with Range (50-59) contains 8 values; the other G-node
contains only 6, so the G-node with (50-59) is chosen for the
Split: the two resulting G-nodes have Ranges (50-54) and
(55-59) (FIG. 59). The resulting B-tree-node configuration
shows that parallel B-tree-node D contains 3 G-nodes and
must be split (B-tree Split). The G-node with range (55-59)
must be re-inserted at the Root-node A. All three Disks
perform this Step in parallel. Re-insertion of the G-node
(55-59) causes the Root-node A to Split (FIGS. 61 and 62).

Preferred Embodiment 
Program Adaptation 

The sequential maintenance program to be adapted can be
made parallel simply by modifying the S-nodes into
P-nodes, grouping the P-nodes into G-nodes (the creation of
a P-node is done along with the creation of the G-node that
contains it), and then adding a few functions in addition to
the original sequential functions. Fullness Rules and Ordering
Schemes may be chosen or defined for efficiency. The
original sequential functions are used to create and maintain
the data-structure configuration: these functions are simply
modified to sort, search and arrange according to the
relationships between G-node Ranges, rather than the
relationships between S-node element values. In the preferred
instance, G-node Range R(X)<R(Y) if all of the elements x_i
in Range R(X) are less than all elements y_j in G-node Range
R(Y): this establishes the relationships between G-nodes in
the adapted data-structures. The method of altering algorithms
is generally to replace comparisons between x and y
in the sequential algorithms with comparisons between R(X)
and R(Y) for the parallelized functions.
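
The substitution of Range comparisons for element comparisons can be sketched in Python. This is a minimal illustration under stated assumptions, not the patent's program: the `Range` and `Node` classes and the `find_g_node` function are hypothetical names, and the small tree holds only the three nodes named in the Example 1 walk-through (A, C and F before any insertion), with left sub-trees omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def __lt__(self, other: "Range") -> bool:
        # R(X) < R(Y): every element of X is less than every element of Y.
        return self.high < other.low


@dataclass
class Node:
    name: str
    rng: Range
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def find_g_node(root: Optional[Node], y: float) -> Optional[Node]:
    """Binary-tree search in which each comparison of y against a node's
    element is replaced by a comparison of y against the node's Range."""
    node = root
    while node is not None:
        if y < node.rng.low:
            node = node.left
        elif y > node.rng.high:
            node = node.right
        else:
            return node  # y falls within this G-node Range
    return None


MAX = float("inf")
f = Node("F", Range(60, MAX))
c = Node("C", Range(50, 59), right=f)
a = Node("A", Range(40, 49), right=c)

found = find_g_node(a, 60)                    # reaches F via A and C
ordered = Range(60, 70) < Range(71, MAX)      # disjoint and below
```

Each processor can run such a search on its own tree, since the Ranges are the same on every processor.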
Function List:

1. Create-G-node(element y)

2. Find-G-node(element y)

3. Search-G-node(G-node v, element y)

4. Add-to-G-node(G-node v, element y)

5. Split-G-node(G-node v)

6. Semi-sort-G-node(G-node v)

7. Adjust-G-node-Ranges(G-node v)

8. Insert-G-node(G-node v)

9. Remove-G-node(G-node v)

10. Resolve-Range-conflict(G-node u, G-node v)

11. Remove-from-G-node(G-node v, element y)

Some of the functions listed above (2, 8 and 9) call the
slightly modified sequential functions for a given
data-structure. "G-node u,v;" and "element y, z, . . ." etc. are
variable declarations or parameters.

Preferred Embodiment Program Adaptation 
Function Explanations: 

1. Create-G-node(element y). This function creates a 
G-node by creating one P-node per-processor in the 
same per-processor location in the data-structure. It 
places the element y in the P-node of the processor 
chosen to hold y. Any or all processors may place their 
own elements y_i in their own P-nodes as well. This
function defines the G-node Range: because the new 
G-node will generally be in an undefined state, the 
G-node Range may be partially or fully undefined; this 
is represented in most cases, by the use of MAXVAL 
and/or MINVAL. If the G-node is the first created in the 
structure, its Range will generally be R(X)= 
{MINVAL, MAXVAL} for ordinal data. This function 
works cooperatively with the other processors. Because 
G-nodes are composed of P-nodes, this function is a 
parallel node creation function as well as a global node 
creation function. 

2. Find-G-node(element y). The find global node function 
is a searching function that locates a G-node with a 
G-node Range into which the element y falls; this 
function can provide individual access to each separate 
graph or data-structure, locating a G-node Range with- 
out involving the entire global data-structure. Sequential
data-structures that already have Search() functions
need only modify those functions to work with G-node
Ranges as opposed to element values (using the range
function R(G-node)). For sequential data-structures
that normally have no Search() functions, knowledge of
the sequential data-structure must be used to create a
proper Find-G-node() function; in such cases, the
G-node found may be one of many possible G-nodes if 
the Ranges overlap. This function returns the G-node 
location found. After the G-node location is found and 
returned, this function may be combined with the 
Search-G-node() function to provide parallel access to 
the parallel data-structure. 

3. Search-G-node(G-node v, element y). This function 
searches the G-node v cooperatively for the element y 
as a parallel access function. G-node v obviously must 
have a Range capable of holding y. This function may 
be initiated by a given processor i and then have the 
other processors return the results of their search to the 
processor i; thus any one processor may search the 
entire parallel data-structure for an element y by (1) 
locating the proper G-node at will in its own separate 
graph and (2) performing a Search-G-node() in cooperation
if necessary, thus accessing all of the separate
graphs together as a parallel data-structure.

4. Add-to-G-node(G-node v, element y). The add to global
node function is called after the appropriate G-node for
element y has been located. This function inserts the
element y into G-node v. This function may arrange the
G-node elements in any way desirable for a given
data-structure according to Rules for Fullness or Ordering
Scheme, or this function may simply place element y in
an empty cell in the P-node of the requesting
processor that is part of G-node v; if this is not possible,
then the requesting processor may cooperate with other
processors to find an empty cell in which to place y in
G-node v.

5. Split-G-node(G-node v). {returns new G-node} This
function calls functions 6 and 7. This function is called
when G-node v is full. The first step is to call function
(6) Semi-sort-G-node(), which arranges the elements in
G-node v such that they are split into two sets X, Y
(X ∪ Y = v); the resulting sets are partially sorted such
that every element x_i falls into a G-node Range distinct
from the Range containing all elements y_j. Without loss
of generality, we assume unique ordinal elements, an
ordinal relationship of "less-than," and the preferred
method of Range calculation for the data-structure: thus
every element x_i is less than every element y_j; the set
X is contained in cells v_{i,1}, the set Y in cells v_{i,2} (i taking
on the values 1 ≤ i ≤ P). The second step is to create new
P-nodes at all processors and move set X or Y into the
new P-nodes at each processor i. The third step is to call
function (7) Adjust-G-node-Ranges(), which resets the
G-node Ranges according to the new distribution of
elements and creates a new Range for the new G-node.
This function (Split-G-node()) may be called on a
defined or undefined G-node; after the function ends
there will be two G-nodes, one of which will usually
remain where it was in the data-structure; the other
must be reinserted by function (8) Insert-G-node() or
placed appropriately. Generally, if the original G-node
v was a defined G-node, then both resulting nodes will
be defined; if not, then at least one of the resulting
G-nodes will be partially defined. The defined G-node
is reinserted (for an example see (7) Adjust-G-node-Ranges()).

6. Semi-sort-G-node(G-node v). As explained above, this 
function divides or partially sorts the elements in 
G-node v and places the resulting distinct sets into the 
proper processors. This function sub-divides and distributes
the portion of data defined by the G-node
Range, in essence creating new ranges. The function
may also send the minimum and maximum values of 
the two sets to each processor (or other information for 
the calculation of Ranges, Fullness, Ordering, etc.). 

7. Adjust-G-node-Ranges(G-node v). The Adjust-G-node-Ranges()
function is key to the adaptation process,
performing range determination to group data into
value ranges: this function is a range addition and
removal function that works in combination with the
insert G-node and remove G-node functions. Like the
Find-G-node() function it depends on the configuration
and rules of the sequential data-structure being adapted.
Examples of Split-G-node() and Adjust-G-node-Ranges()
are given together because they are so closely
related. There are different ways of adjusting Ranges
for different data-structures. Also, the G-node v is not
the only G-node which will have its Range adjusted;
there may be adjustments on any nodes which have
their Ranges wholly or partially dependent on G-node
v. The goal is to maintain the rule which governs the
relationships between nodal values by adjusting the
Ranges to fit the new placement of the elements and/or
G-node(s). The Adjust-G-node-Ranges() function can
be run simultaneously but blindly on all processors.
This function may use the minimum and maximum
values of the elements of the G-nodes in addition to the
values of old Ranges. When adjustments on each
processor are made blindly, they are depended upon to
be identical over all processors because they use the
same values. Example: split and adjustment made in a
parallel ordered list with N G-nodes.



Original List

                 C             D             E
Range            (50-74)       (75-109)      (110-...)
Processor 1      [50]          [75, 80]      [110]
Processor 2      [illegible]   [illegible]   [120]
Processor 3      [70, 65]      [85, 100]     [130]

G-node Ranges: R(C)=R(c_{0i})={50, 74}; R(D)=R(d_{0i})={75, 109};
R(E)=R(e_{0i})={110, . . .}

Step 1: Call Semi-sort-G-node(D)

                 C             D             E
Range            (50-74)       (75-109)      (110-...)
Processor 1      [50]          [75, 95]      [110]
Processor 2      [illegible]   [80, 90]      [120]
Processor 3      [70, 65]      [85, 100]     [130]

Step 2: Split G-node D creating G-node V

                 C             D          V              E
Range            (50-74)       (75-109)   (not yet set)  (110-...)
Processor 1      [50]          [75]       [95]           [110]
Processor 2      [illegible]   [80]       [90]           [120]
Processor 3      [70, 65]      [85]       [100]          [130]

Step 3: Re-insert V (Insert-G-node(V))

In this circumstance (ordered list) the re-insert is predictable
and obvious. Step 4 will adjust all G-node Ranges at the
same time.

Graph for Step 3

(Note that the Range for G-node V is depicted although it
is not calculated until step 4.)

                 C             D          V          E
Range            (50-74)       (75-89)    (90-109)   (110-...)
Processor 1      [50]          [75]       [95]       [110]
Processor 2      [illegible]   [80]       [90]       [120]
Processor 3      [70, 65]      [85]       [100]      [130]

Step 4: Set new G-node Ranges

Under other circumstances, the new Ranges for D would
be set as well as for V; after which, insertion of V would take
place and a new resetting of Ranges done for the insert; here
all Ranges are set at Step 4 because the placement and
Range-setting are obvious. However, the pattern should be
clear:

Ranges:

R(C): unchanged, {50,74}. R(C)'s second value R(c_{0i2})
is still based on R(d_{0i1}), which is unchanged.

R(D): R(d_{0i2}) is changed: {75,89},
R(d_{0i2}) = R(v_{0i1}) - 1 = 90 - 1 = 89.

R(V): {90,109}
R(v_{0i1}) = minimum value of V = 90
R(v_{0i2}) = R(e_{0i1}) - 1 = 110 - 1 = 109

R(E): unchanged.

For this parallel ordered list data-structure, the formula for
G-node Ranges R(X) (X taking on values 2 ≤ X ≤ (N-1), where
N = number of G-nodes) is

\[
R(X) = \left\{\; R(x_{0i1}) = \text{minimum element of } X,\quad
R(x_{0i2}) = R\big((X+1)_{0i1}\big) - 1 \;\right\}
\]

The formulas for other data-structures, though more complex
in general, are very similar. At the extreme ends of the
spectrum for the parallel ordered list we have special values:

For N G-nodes we have:

R(a_{0i1}) = MINVAL

R(n_{0i2}) = MAXVAL

The G-nodes A and N are partially defined G-nodes.

The above example shows G-node addition, adding one
new G-node and one new range to the parallel data-structure.
If the new G-node V were then removed, the process would
simply be reversed, removing G-node V and removing
G-node V's range by merging the range of G-node V into the
range of G-node D.

8. Insert-G-node(G-node v). This function works the same
as the Insert() function for the sequential data-structure
except that it uses the values of G-node Ranges to
arrange and relate the G-nodes rather than element
values to arrange S-nodes (using the range function
R(G-node)).

Adding a new node to a sequential data-structure generally
requires reconfiguring the links to represent changes in
the logical relationships. Each decision (e.g. IF statement) in
the sequential algorithm can be modified to use the range
function R() to produce or adjust the proper relationships
and position the nodes within the order of the data-structure
by range.

All processors may perform this function simultaneously
and blindly; however, in the event that two G-nodes with
overlapping Ranges collide, the Resolve-Range-conflict()
function may be called; Resolve-Range-conflict() is
cooperative. In most respects, this function, Insert-G-node(), is
identical to the sequential Insert().

9. Remove-G-node(G-node v). All of the statements made
about function (8) Insert-G-node() also apply to this
function with respect to the sequential Remove() function.
In most respects, this function is identical to the
sequential Remove().

10. Resolve-Range-conflict(G-node u, G-node v). This function
resolves the problem of overlapping G-node Ranges. A 
difficulty presents itself if a data-structure creates over- 
lapping Ranges because of non-contiguous data place- 
ment. Two G-nodes may try to occupy the same or 
adjacent positions in the data-structure. If two such 
G-nodes conflict, then the elements in the G-nodes 
must be divided between them in such a way that the 
new ranges calculated for an element arrangement do 
not overlap. This function may determine ranges and 
force re-distribution of the element values or it may 
semi-sort the elements across the nodes u and v forcing 
re-determination of ranges based on the semi-sort. 

11. Remove-from-G-node(G-node v, element y). The 
remove from global node function is called after the 
appropriate G-node for element y has been located. 
This function removes the element y from G-node v. 
This function may arrange the G-node elements in any
way desirable for a given data-structure according to
Rules for Fullness or Ordering Scheme, or this function
may simply remove element y from the P-node that
contains it.
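
The ordered-list Range rule worked through under function 7 (lower bound = minimum element of the G-node, upper bound = lower bound of the next G-node minus one, with MINVAL and MAXVAL at the two ends) can be sketched in C as below. This is only an illustration under stated assumptions: the names Range and adjust_ranges, the use of int keys, and the use of INT_MIN/INT_MAX as MINVAL/MAXVAL are all hypothetical and are not taken from the original text.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical closed Range {lower, upper}. */
typedef struct {
    int lower;
    int upper;
} Range;

/*
 * Recompute the Ranges of an ordered list of n G-nodes.
 * min_elem[x] is the minimum element currently stored in G-node x
 * (G-nodes are already in order, so min_elem is strictly increasing).
 * Every processor can run this "blindly": given the same minimums,
 * each one arrives at the same Ranges.
 */
void adjust_ranges(const int *min_elem, int n, Range *out)
{
    assert(n >= 1);
    for (int x = 0; x < n; x++) {
        /* First G-node is open below (MINVAL). */
        out[x].lower = (x == 0) ? INT_MIN : min_elem[x];
        /* Last G-node is open above (MAXVAL); otherwise stop just
           short of the next G-node's lower bound. */
        out[x].upper = (x == n - 1) ? INT_MAX : min_elem[x + 1] - 1;
    }
}
```

With minimum elements 50, 75, 90 and 110 for the interior G-nodes C, D, V and E, this rule reproduces the Ranges of Step 4: {50,74}, {75,89}, {90,109}, and a final Range that starts at 110.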

Preferred Embodiment 
Generalized Parallel Method 

The following is a generalized parallel method which uses
the previously defined functions to create the parallel
data-structures. The configurations of the data-structures in
question are determined by the Insert() and Remove() functions
of the sequential data-structures that have been modified
slightly to use G-node Ranges to establish ordinable
relationships. The slightly modified functions are called from
within functions Insert-G-node() and Remove-G-node().
The indications of steps 1, 2 and 3 to the left of the
pseudo-code are the steps explained in the Verbal Description
section.

Preferred Embodiment 
Function Parallel-Insert (element y): 

This function is called by any processor wishing to insert 
the element y into the parallel data-structure. It is assumed 
that the first G-node of the data-structure has already been 
created. 



Parallel-Insert (element y)
           G-node u, v

Step 1 ->  v = Find-G-node(y)
Step 2 ->  if (there is an empty cell in P-node v)
           then
               place y in P-node v
           else
               Add-to-G-node(v, y)
           end-if

Step 3 ->  if (G-node v is full)
           then
               u = Split-G-node(v)
               Insert-G-node(u)
               Adjust-G-node-Ranges(u)
           end-if

END FUNCTION
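
The three steps of Parallel-Insert can be sketched in C for the simplest case, a parallel ordered list, with the P processors simulated inside one process. Everything named here is an assumption for illustration: the types GNode and PList, the constants P, CELLS and MAX_GNODES, the Rule for Fullness (a G-node is full when every cell is used), and the use of a full sort where the text only requires a semi-sort. Elements are assumed unique, and the split assumes exactly two cells per P-node.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define P 3            /* processors (assumed) */
#define CELLS 2        /* cells per P-node (assumed; split needs 2) */
#define MAX_GNODES 16  /* capacity of this toy list (assumed) */

typedef struct {
    int lower, upper;      /* G-node Range */
    int cell[P][CELLS];    /* cell[i][c]: cell c of the P-node on processor i */
    int count[P];          /* used cells in each P-node */
} GNode;

typedef struct {
    GNode g[MAX_GNODES];   /* G-nodes kept in Range order */
    int n;
} PList;

int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* The first G-node covers {MINVAL, MAXVAL}. */
void plist_init(PList *l)
{
    memset(l, 0, sizeof *l);
    l->g[0].lower = INT_MIN;
    l->g[0].upper = INT_MAX;
    l->n = 1;
}

/* Step 1: locate the G-node whose Range holds y. */
int find_gnode(const PList *l, int y)
{
    for (int k = 0; k < l->n; k++)
        if (l->g[k].lower <= y && y <= l->g[k].upper)
            return k;
    return -1;
}

int gnode_total(const GNode *g)
{
    int total = 0;
    for (int i = 0; i < P; i++) total += g->count[i];
    return total;
}

/* Step 2: try the requester's own P-node first, then the others. */
void add_to_gnode(GNode *g, int proc, int y)
{
    for (int t = 0; t < P; t++) {
        int i = (proc + t) % P;
        if (g->count[i] < CELLS) {
            g->cell[i][g->count[i]++] = y;
            return;
        }
    }
    assert(0 && "G-node was already full");
}

/* Step 3: split full G-node k; the upper half becomes a new G-node at k+1. */
void split_gnode(PList *l, int k)
{
    GNode *d = &l->g[k], v;
    int all[P * CELLS], n = 0;
    assert(l->n < MAX_GNODES);
    for (int i = 0; i < P; i++)
        for (int c = 0; c < CELLS; c++) all[n++] = d->cell[i][c];
    qsort(all, (size_t)n, sizeof all[0], cmp_int);
    memset(&v, 0, sizeof v);
    for (int i = 0; i < P; i++) {
        d->cell[i][0] = all[i];     d->count[i] = 1;  /* set X stays */
        v.cell[i][0] = all[P + i];  v.count[i] = 1;   /* set Y moves */
    }
    v.lower = all[P];          /* minimum element of the new G-node */
    v.upper = d->upper;
    d->upper = all[P] - 1;
    memmove(&l->g[k + 2], &l->g[k + 1],
            (size_t)(l->n - k - 1) * sizeof(GNode));
    l->g[k + 1] = v;
    l->n++;
}

void parallel_insert(PList *l, int proc, int y)
{
    int k = find_gnode(l, y);
    assert(k >= 0 && proc >= 0 && proc < P);
    add_to_gnode(&l->g[k], proc, y);
    if (gnode_total(&l->g[k]) == P * CELLS)
        split_gnode(l, k);
}
```

Because the list is ordered, the re-insert of the new G-node is simply "place it immediately after the one that was split"; for a tree, split_gnode would instead hand the new G-node to the adapted sequential Insert.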



Preferred Embodiment
Function Parallel-Remove (element y)

This function finds and removes a specific value y. Some
data-structures remove elements by location (example:
priority-queue); in such cases, the Find-G-node() function
may be adapted to find the proper location, and then the
G-node may be sorted or searched for the appropriate value.

Parallel-Remove (element y)
           G-node v

Step 1 ->  v = Find-G-node(y)
Step 2 ->  if (G-node v is found)
           then
               Search-G-node(v, y)
               if (y is found in G-node v)
               then
                   Remove y from cell and
                   send to proper processor
               end-if
           end-if

Step 3 ->  if (G-node v is empty)
           then
               Remove-G-node(v)
               Adjust-G-node-Ranges(u)
           end-if

END FUNCTION

It is assumed in the preferred instance that most sequential
functions for the adaptable sequential data-structures can be
encapsulated in the sequential Insert() and Remove()
functions, or that they can be adapted and used in the same
manner as those functions. The Parallel-Insert() and
Parallel-Remove() functions describe the formation and
functioning of the parallel data-structures in their important
aspects.

Preferred Embodiment Program Adaptation
Find Function

This section contains a small section of C-like pseudo-code
intended to illustrate the simplicity of adapting single
processor functions to multiple processor functions. The
code is not intended to be in perfect C syntax, only to give
the basic concepts that could be used for creating the Find
function in parallel form. Nor is the code intended to be the
only possible embodiment of the Find function
(Find-G-node). This example should also help to illustrate the
nature of the "slightly adapted" single processor Insert and
Remove functions mentioned previously because the majority of
the work done for those functions is the location of the proper
position in the data-structure for an element value.

The pseudo-code shown is a Find function for a binary
search tree; the concepts expressed are given to be useable
for other data-structures as well. The pseudo-code shows
that the primary difference between single and multiple
processor functions is the replacing of the comparisons of
element values with comparisons of G-node-Ranges: it is
easiest to illustrate this by taking advantage of operator
overloading with regard to the <, >, and == operators;
these operators are assumed to work equally well on element
values and G-node-Ranges.

Single Processor Definitions and Pseudo-code

struct node_st {
    key_type key;
    node_st *leftchild;
    node_st *rightchild;
}

node_st *Find(node_st *node, key_type element)
{
    if (node->key == element)
        return(node);
    if (node->key < element)
        return(Find(node->rightchild, element));
    else
        return(Find(node->leftchild, element));
}

Multiple Processor Definitions and Pseudo-code

X = Maximum number of elements per P-node;

struct Range {
    key_type lowerbound;
    key_type upperbound;
}

struct Pnode_st {
    int node_number;
    key_type key[X];
    Pnode_st *leftchild;
    Pnode_st *rightchild;
    Range Gnode_Range;
}

Pnode_st *PFind(Pnode_st *node, key_type element)
{
    if (node->Gnode_Range == element)
        return(node);
    if (node->Gnode_Range < element)
        return(PFind(node->rightchild, element));
    else
        return(PFind(node->leftchild, element));
}

Preferred Embodiment
Data Model

FIG. 63 depicts a possible data model for an embodiment
of the present invention. Each box is a data entity expressing
an aspect of the program design for the embodiment. Each
data entity is a set of rules, data-structure, record, other set
of data and/or associated maintenance functions used to
create the parallel maintenance system. No particular
modeling technique is implied.

The data model shown is only one possible model, given
to express the parallel system components from a data model
perspective. The relationships are described below.

A.1 - indicates a set of range adjustment rules may comprise
or relate to multiple sets of range addition rules;

A.2 - a set of range adjustment rules may comprise or
relate to multiple sets of range removal rules;

A.3 - a set of range adjustment rules may comprise or
relate to multiple sets of range breadth adjustment
rules;

D.1 - a set of range determination rules may comprise or
relate to multiple sets of range adjustment rules;

G.1 - a set of adjustment need rules applies to many
G-nodes;

G.2 - a set of range determination rules applies to many
G-nodes;

G.3 - [partly illegible in the source] relationship is
one-to-one;

G.4 - a logical relationship may relate many G-nodes, and
a G-node may have many logical relationships;

G.5 - a G-node contains many P-nodes;

G.6 - a set of arranging rules applies to many G-nodes;

P.1 - a P-node contains many data value storage entities
or elements;

R.1 - a set of range relation rules applies to many ranges;

R.2 - a set of range relation rules applies to many logical
relationships;

Notes on B-tree Section

The two rules (Ordering Scheme and Rule for Fullness)
used in this section are not the only possible rules for storing
[illegible in the source]
a parallel B-tree or other data-structure on multi-component
Dynamic Access Storage Devices such as disk drives. Many
other Fullness and Ordering rules may be used (defined), but
the essential pattern of the present method remains the same.

Well known methods exist for storing information on the
locations of data storage blocks. These can be used to store
information on the fullness of the parallel B-tree-nodes. A
bit-map stored in memory would suffice, as would a bit-map
stored in high-speed secondary storage (e.g. a faster, more
expensive disk drive than the others used). Many other
possibilities exist for the storage of the information; the only
requirement is that it allows the determination of the fullness
of B-tree-nodes and G-nodes without accessing every drive.

It should also be noted that the preferred embodiment and
the data-structures and maintenance routines that result from
it function by sending the locations of data-structure nodes
between processing elements. This may require a method of
storing or calculating the location of the memory or disk
space allocated for the different portions of data-structure
storage: one such method would be keeping an index to a
storage space as it is allocated for nodes on each device;
another would be the storage or definition of explicit pointers
in each P-node to the other P-nodes within a given
G-node; for disk drives, many indexing techniques already
exist to perform functions very similar to this. Some such
techniques are described in "Operating System Concepts"
by Silberschatz and Galvin, Addison-Wesley Publishing,
1994 (fourth edition).

The amount of data stored on the B-trees in these
examples is small. It is not reflective of the size of B-trees
on real systems. In addition, the nodes themselves are
relatively small, and the methods of storing data from
node-to-node could make use of additional techniques to
improve efficiency. Commonly known techniques for
single-processor B-trees include the techniques used for B* trees
and B+ trees. Also, the use of overflow blocks and techniques
derived from B* and B+ trees could be added to the
examples given here.

Other Embodiments
Definition of G-nodes for the B+ Tree

A variation on the B-tree is the B+ tree. The following
describes one embodiment of the parallel B+ tree. The file
structures book referenced in this application defines a B+
tree as a B-tree for which data-file record pointers are stored
only at the leaves of the tree. This indicates that the
definition of the B-tree nodes in a B+ tree takes two forms:
one form for the non-leaf nodes and one form for the leaf
nodes. The same may be done for the definition of G-nodes
and their Ranges for parallelized data-structures. I use the
parallel B+ tree to illustrate this concept.

Because the elements (tuples) stored in the B+ tree only
contain data-file record pointers at the leaf-nodes of the tree,
the G-node Ranges in the non-leaf nodes do not require the
storage of actual tuples containing record pointers. This
means that the only useful information in the non-leaf nodes
is the storage of G-node Ranges: the Ranges are used to
locate the desired leaf-nodes. B+ tuples are never inserted
into non-leaf nodes and therefore the parallel Ranges need
not be defined to contain values. Single values may be stored
in the non-leaf nodes to represent non-leaf Ranges (the
[illegible] value of the Range may equal the maximum value
[illegible in the source]).

The leaf nodes of the B+ tree have G-nodes and G-node
Ranges defined in the manner described in previous sections.
[One sentence illegible in the source.] All non-leaf Range
values are based on the values contained in the leaf-nodes of
the tree.

Other Embodiments
Complex Ranges

More complex range calculations than those described in
other sections are possible and justifiable. For example, an
additional embodiment of an adapted AVL tree may be
created by the use of a different set of range relation rules or
range determination rules. The AVL tree previously
described herein used range relation rules defined in a linear
contiguous fashion: R(A_{0i0}) < R(B_{0i0}) if and only if
R(A_{0i2}) < R(B_{0i1}) (i.e. Max of A less than Min of B); this
produced a distribution of the total data set such that the
possible storage of a given value x on a processor was only
determinable by locating its G-node Range.

Imagine instead a range function such that the highest
order digit is ignored. Thus, a range (#50-#70) could contain
values 150, 250, 350, 165, 266, 360, 370, etc. In addition
imagine an Ordering Scheme such that processor 1 contains only
values whose first digit is 1, processor 2 only values whose
first digit is 2, etc. This combination of range function R()
and Ordering Scheme creates a parallel structure such that a
given value is known to be stored on a given processor i or
not at all before search begins (e.g. the value 563 will be
found on processor 5 or not at all, the value 828 on processor
8 or not at all, etc.). If leading zeros are assumed, then the
combination also creates a data structure composed of ten
separate structures, each having its own range of possible
values (i.e. (000-099), (100-199), (200-299), etc.).
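
A small C sketch can make the "ignore the highest order digit" idea concrete. It assumes three-digit decimal keys (000-999) and ten processors numbered by leading digit; the function names owner_processor and in_masked_range are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Ordering Scheme: a three-digit key lives on the processor named by its
   highest-order digit (leading zeros assumed, so 042 lives on processor 0). */
int owner_processor(int key)
{
    assert(key >= 0 && key <= 999);
    return key / 100;
}

/* Range function that ignores the highest-order digit: the Range (#lo-#hi)
   holds every key whose two low-order digits lie between lo and hi. */
bool in_masked_range(int key, int lo, int hi)
{
    assert(key >= 0 && key <= 999);
    int low_digits = key % 100;
    return lo <= low_digits && low_digits <= hi;
}
```

Under these assumptions a search for 563 first computes owner_processor(563), which is 5, and only then looks for a G-node on that processor whose masked Range accepts the low-order digits 63; no other processor needs to be consulted.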

An advantage gained by the grouping of elements into 
ranges as described above while simultaneously grouping 
the elements by G-node Ranges is that the elements are 
sorted by high order digits and sub-sorted by low order digits 
and simultaneously sorted by low order digits and sub-sorted 
by high order digits. 

Such sorting may even be useful if the P-nodes related by
high order digits are grouped and contained on a single
graph, rather than the multiple graphs described in other
embodiments herein.

Such complex range calculation as described above shows
a more advanced grouping of elements by range than other
embodiments described herein. The elements contained in
the data structure are organized in two fashions: by high
order digits and low order digits. This grouping illustrates an
element's or a P-node's membership in multiple complex
sets. Another instance (a refinement or improvement of the
concept of membership in multiple sets) could provide a
P-node membership in a plurality of sets, each set organized
for access by different aspects of the data stored (e.g. last,
first and middle name, etc.).

FIG. 65 shows nine P-nodes, all related by commonality
of complex G-node ranges: the nine P-nodes are all part of
a complex G-node. The ranges may of course be stored
implicitly and partially calculated by processor number;
however, on the diagram, they are explicitly listed. Pound
signs indicate wildcards; numeric entries separated by
dashes indicate ranges; the first two entries may be combined
to form an ordinal range and then further refined by
adding the last entry: therefore processor 5, having complex
G-node Range (#50-#70, 2##, ##4-##6), may contain numbers
between 250 and 270 whose last digit is between 4 and
6. If the nine processors depicted are on a two dimensional
mesh of processors, then each linear array may be accessed
according to the common key attribute being sought by a
user or system process (e.g. any key being sought between
100 and 199 will be found on processor 1, 4 or 7). The rules
for insert (i.e. range relation rules) for the data structure in
FIG. 65 are assumed to apply to the "#50-#70" portion of
the complex range: that is, the links are configured by that
portion of the complex range such that if x>y then R(#x)
>R(#y). FIG. 65 represents a parallel data-structure considered
to have two dimensions at the processor or storage
level; the possibility of more dimensions is implied.
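
The complex G-node Range quoted for processor 5, (#50-#70, 2##, ##4-##6), can be read as three independent tests on a three-digit key. The following C sketch encodes that reading; the struct ComplexRange, its field names, and the restriction to three-digit decimal keys are assumptions made for illustration and are not part of the original description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding of a complex G-node Range such as
   (#50-#70, 2##, ##4-##6) for three-digit decimal keys. */
typedef struct {
    int low2_min, low2_max;  /* "#50-#70": bounds on the two low-order digits */
    int hundreds;            /* "2##": required high-order digit */
    int last_min, last_max;  /* "##4-##6": bounds on the last digit */
} ComplexRange;

/* A key belongs to the complex Range only if all three parts accept it. */
bool complex_range_contains(ComplexRange r, int key)
{
    assert(key >= 0 && key <= 999);
    int low2 = key % 100;   /* wildcard on the hundreds digit */
    int high = key / 100;   /* wildcards on the two low digits */
    int last = key % 10;    /* wildcards on the two high digits */
    return r.low2_min <= low2 && low2 <= r.low2_max
        && high == r.hundreds
        && r.last_min <= last && last <= r.last_max;
}
```

For example, with the Range {50, 70, 2, 4, 6} the key 254 is accepted, while 270 (last digit 0) and 354 (wrong high-order digit) are rejected, matching the description of "numbers between 250 and 270 whose last digit is between 4 and 6."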

The great variety of combinations offers a wide range of
possible uses according to the needs of a given system or
data-structure.

Other Embodiments
Dependent Ranges

Imagine a decision-tree used, for instance, to play chess.
A given function can identify when a piece on the board is
threatened by an opposing piece. This increases the priority
of moving that piece. A given node in the tree representing
this situation on the board will have a wider range of
possible moves than nodes dependent on the given node.
Each possible movement of the threatened piece is within
the range defined by its identity as a move of the piece and
its dependency on other nodes. This range may be distributed
to multiple processing elements according to range
determination, Rules for Fullness and Ordering Schemes
defined for the decision tree algorithm's use in a parallel
environment. In this instance, the values and ranges for the
nodes are created together, rather than input and inserted or
removed.



Addendum 



G-node Ranges 



The calculation of G-node-Ranges is key to the entire 
process. Generally, it is simply a matter of determining 
which nodes on a data-structure most closely determine the 
positions of the other nodes; that is to say which nodes 
contain values that determine the values that may be
contained in a given node. Partially defined G-node Ranges may
frequently be found at the extreme points of the
data-structure; for instance, the root and leaves of a heap, or the
rightmost and leftmost node of a binary search tree, 2-3
tree, or B-tree.



Addendum 
Ordinal vs. Ordinable Data 

Most of the Examples for this application are given for 
ordinal data types. However, any data-structure having the 
capacity of the data values to be grouped into suitable 
G-node Ranges will be adaptable. If the G-node Ranges can 
be constructed such that the nodes which branch off from the 
members of the Range can be said to relate to all of those 
members in a similar fashion, then the parallel or global 
links between nodes are justified and will be consistent with 
the data-structure and/or method rules. Such data-structures 
and/or methods are adaptable by this process. 

Addendum 
Use of Space/Merging G-nodes 

Because the G-node-Remove() function is only used on
sufficiently empty G-nodes, it is possible to have large
data-structures with large numbers of partially empty
P-nodes; however, the present method is capable of adjustment
to make efficient use of space. A G-node-Merge()
function to merge two sparsely populated G-nodes into one
would be one way to resolve this problem; another would be
to alter the Rule for Fullness, changing the lower limit on the
number of elements in a G-node to half P and remove
P-nodes that break the rule, reinserting their elements.
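
One way such a G-node-Merge() function might look for a parallel ordered list is sketched below in C. The text does not define this function, so the whole block is an assumption: the GNode layout, the constants P and CELLS, the requirement that the two G-nodes be adjacent in Range order, and the policy of placing each moved element on the least-loaded P-node are all hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define P 3      /* processors (assumed) */
#define CELLS 2  /* cells per P-node (assumed) */

typedef struct {
    int lower, upper;      /* G-node Range */
    int cell[P][CELLS];    /* cell[i][c]: cell c of the P-node on processor i */
    int count[P];          /* used cells in each P-node */
} GNode;

int gnode_total(const GNode *g)
{
    int total = 0;
    for (int i = 0; i < P; i++) total += g->count[i];
    return total;
}

/* Merge `right` into `left` when all elements fit into one G-node.
   Returns false (and changes nothing) when they do not fit. */
bool gnode_merge(GNode *left, GNode *right)
{
    assert(left->upper < right->lower);   /* Ranges must not overlap */
    if (gnode_total(left) + gnode_total(right) > P * CELLS)
        return false;
    for (int i = 0; i < P; i++) {
        for (int c = 0; c < right->count[i]; c++) {
            /* Put the element on the P-node of `left` with most free space. */
            int dst = 0;
            for (int j = 1; j < P; j++)
                if (left->count[j] < left->count[dst]) dst = j;
            left->cell[dst][left->count[dst]++] = right->cell[i][c];
        }
        right->count[i] = 0;
    }
    left->upper = right->upper;   /* the merged Range absorbs right's Range */
    return true;
}
```

Applied to the two G-nodes produced by the ordered-list split example ((75-89) and (90-109), three elements each), this merge restores a single full G-node with Range (75-109), which is the reverse of the split.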

Addendum 
Contiguity of Data Distribution 

Non-contiguous data distribution like that of a heap 
makes difficult the efficient search, and therefore efficient 
placement, of elements into unique G-node Ranges. One 
solution to this is the Resolve-Range-Conflict() function;
however, for those data-structures that can be forced into a 
contiguous configuration and thereby make efficient search- 
ing possible, this function may not be necessary. Non- 
contiguous methods of defining ranges or distributing values 
may also be defined and used for the present invention. 

Addendum 
Data-Structures/Methods

The data-structures and methods listed in this application 
are only examples of adaptations from sequential to parallel. 
Many other data-structures and methods not listed can be 
adapted through this process. No restriction on types of data 
stored or manner of storage is implied. Many distributed or




parallel data-structures may be created in accordance with
the principles of the present invention. The present invention
may also be used to create new data-structures without serial
counter-parts.

Examples of Application of Preferred Embodiment

The following two examples illustrate the functioning of
two adapted parallel data-structures on a working system.
Two examples are given to show that the parallel routines
and data-structures may function on a variety of different
systems. Specifically, one Example is stored in memory on
a parallel-processing hypercube network, and the other is
stored on disk. Although the data-structures can be used by
any program, whether batch or on-line, the examples illustrate
the functioning of the data-structures by assuming
multiple users accessing the same system simultaneously.

Example of Application 1

FIG. 66 shows a parallel machine with 128 processors
connected by a hypercube network. A powerful machine
such as this could serve a great number of users
simultaneously, but only three are depicted. Each of the
terminals (numbered 1, 14, and 127) have allocated their
respective processors 1, 14, and 127, and are conducting
on-line accesses to a file located on disk in a hashed file with
secondary keys stored in an Adapted parallel data-structure:
the keys are stored in the memories of the various processors
distributed throughout the hypercube on a parallel m-way
search tree. Each processor has 16 Mega-bytes of memory.
Each processor stores approximately 1/128th of the file's keys
in its local memory. If we assume that the search-tree can
store each key using 20 bytes of memory (including
pointers, indexes, etc.), and we also assume that each
processor uses a maximum of approximately 1 Mega-byte of
RAM, then approximately 50,000 keys may be stored on
each processor: 50,000 x 128 = 6,400,000 keys may be stored
in parallel memory. The same tree stored in a single processor's
memory would require 128 Mega-bytes of memory
and probably force the storage of the tree onto disk. In
addition, each user on the system may search the tree
simultaneously: little or no queuing will result from simultaneous
accesses to the tree, unless more than 128 users are
logged on.

(Note that FIG. 38 used for this example was designed for
3 processors in the General Example: the key ranges and size
of the tree are accordingly small, and the processor numbers
illustrated are different.) If we assume that the processors 1,
14, and 127 contain the search-tree nodes depicted in FIG.
38, then User 1 could request key 56, User 14 could request
key 15, and User 127 could request key 10 simultaneously.
Each processor would then access two nodes of its own
locally stored tree to reach the bottom level, send requests
for keys to other processors as necessary, and receive replies.
The same values stored on a single-processor tree would
require more accesses (a taller tree) and queuing. In either
case the disk could be accessed after retrieval of the keys
from the search tree, and the users would receive the
appropriate records from disk.

Example of Application 2

FIG. 67 shows three terminals connected to a server; the
server is connected to three disk-packs. A parallel B-tree
distributed amongst the three disk-packs can be accessed
simultaneously by each user. If Users 1, 2 and 3 all make
requests to access the B-tree index at the same time, then the
server would have to queue these requests and distribute
them among the disk-packs one-at-a-time. However, disk
access is much slower than memory access, so the queuing
and distribution of the requests in the server's memory
might take 10 microseconds per request. Therefore, the last
request to disk would be made 30 microseconds after the
user made the request. If we assume that each key is stored
on the fourth level down in the B-tree, and we assume that
each disk access requires 10 milliseconds, then the last
request for a key would be fulfilled 40 milliseconds + 30
microseconds after the request. If the B-tree were stored on a
single disk, then in the worst case, each key request would
require 10 milliseconds x 4 tree levels x 3 user-requests = 120
milliseconds + queuing.

The make-up of the two systems described above
differs, but in each case the work is successfully distributed
to processing elements, giving better response time.
The suggested and precise make-up of the systems
depicted are given only for the purpose of example. The
times given are estimates and the calculations are simple
illustrations of the functioning of the types of systems that
could make use of the Adapted data-structures.

Conclusion, Ramifications and Scope

Thus the reader can see the results of combining the
various aspects of this method of creating and using parallel
data-structures. The present invention provides a great variety
of possible combinations of rules for fullness of nodes,
range determination, parallel and global node definition, and
data distribution such that each aspect of the invention, in
addition to others not listed, may be used in combination
with one or more of the others, or alone, to enhance
performance of the parallel data structures and define new
data-structures, including parallel forms of serial data-structures
and many others.

The combinations of components in the embodiments
herein are not the only combinations possible. Not only are
different combinations possible, but different instances of
the components themselves, such differences exemplified by
the various rules for fullness, ordering schemes, and range
calculations described and contrasted in this application,
though not limited to those descriptions or those components.

While my description above contains many specifics,
these should not be construed as limitations on the invention,
but rather as an exemplification of preferred embodiments
thereof. Many other variations are possible. Accordingly, the
scope of the invention should not be limited to the embodiments
illustrated; the scope of the invention should be
determined by the appended claims and their legal equivalents.

I claim:

1. A method of maintaining order for data on a computer
system by creating a parallel data-structure, said data stored
on one or more memory storage means, accessed by one or
more processing elements, said order represented either
explicitly or implicitly as a graph or graphs containing nodes
that represent sets of data values grouped into ranges and
incident links that represent logical relationships between
said sets of data values, the nodes and links either explicitly
or implicitly stored on said memory storage means, said
memory storage means and said order maintained by said
processing elements,

i. wherein said memory storage means is divided into
logically corresponding storage units or partitions, said
partitions defined by a parallel storage location of said
nodes on said memory storage means, and

ii. wherein one form of said logical relationship is a serial
or local relationship relating two or more differing said
ranges to each other according to their differences, said
local relationships relating said ranges within said
partition of said memory storage means, and

iii. wherein a second form of said logical relationship is
a global relationship relating two or more similar said
ranges to each other according to their similarity or
commonality, said global relationships relating said
ranges between multiple said partitions, and

iv. wherein said nodes form individual nodes having said
local relationships with other said individual nodes
with different said ranges, said individual nodes also
referred to as parallel nodes, and

v. wherein said nodes form global nodes comprising
multiple said individual nodes having said global relationships
with other said individual nodes, said global
nodes thereby comprising multiple said individual
nodes with the similar or common said ranges, each
said individual node within a given said global node
having the common range, such that said global node is
a composite of said individual nodes,

the method comprising the steps of:

a. determining said ranges for said sets of data values,

b. assigning said ranges to said nodes and assigning
said sets of data values to said nodes by determining
the ranges into which they fall,

c. positioning said individual nodes within said order
using said links by determining said local relationships
and said global relationships between said
ranges,

d. storing said nodes in different portions of said
memory storage means such that said ranges with
said commonality are stored on multiple said individual
nodes, each said individual node with the
common said range stored in a different said portion
thereby defining said partitions and said global nodes
comprising a plurality of said individual nodes, and
thereby storing said local relationships as explicit or
implicit said links within said partition and said
global relationships across multiple said partitions,

whereby a combination of the local and global relationships
creates a composite global data-structure comprising multiple
serial or local data-structures, and whereby said data is
maintained in said order on each of said partitions in a
uniform manner and on all of said memory storage means
combined, and a plurality of system processes are enabled to
access the data values simultaneously by accessing said
individual nodes in a given said partition of choice, thus
gaining access to said global node having desired said range
and to the global data-structure as a whole.
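
As a non-limiting illustration of the arrangement recited in claim 1, the following sketch stores one individual node per range in every partition, so that the individual nodes at the same position across partitions together form a global node. The class name ParallelStructure, the use of integer values, and the modulo rule for choosing a partition are assumptions made for the sketch only.

```python
import bisect


class ParallelStructure:
    """Each partition holds one individual node per range (assumed layout)."""

    def __init__(self, lows, n_partitions):
        # lows[i] is the inclusive lower bound of range i; range i ends
        # where range i + 1 begins (the last range is open-ended).
        self.lows = sorted(lows)
        self.partitions = [[[] for _ in self.lows] for _ in range(n_partitions)]

    def range_index(self, value):
        # Local relationship: ranges are ordered, so a search over the
        # range bounds locates the proper range from any partition.
        i = bisect.bisect_right(self.lows, value) - 1
        if i < 0:
            raise ValueError("value lies below the lowest range")
        return i

    def insert(self, value):
        i = self.range_index(value)
        p = value % len(self.partitions)   # hypothetical distribution rule
        bisect.insort(self.partitions[p][i], value)

    def global_node(self, i):
        # Global relationship: same range, one individual node per partition.
        return [part[i] for part in self.partitions]

    def contains(self, value):
        i = self.range_index(value)
        return any(value in node for node in self.global_node(i))
```

For example, with ranges starting at 0, 100 and 200 spread over three partitions, inserting 10, 15 and 56 places one value in each partition's individual node for the first range, and a search begun in any partition reaches the same global node.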

2. A method as recited in claim 1 wherein said order is
expressed as a plurality of separate said graphs, each said
graph stored separately within said memory storage means,
each said graph arranged by arranging rules of an adapted
sequential data-structure, thus creating said parallel data-structure
capable of the same functions as said sequential
data-structure in a distributed environment.

3. A method as recited in claim 2 wherein each said
individual node contained in a given said global node has an
identical said range to all other said individual nodes in the
given global node and wherein all the logical relationships
between all said individual nodes belonging to the given
global node and all said individual nodes belonging to
another said global node are identical.

4. A method as recited in claim 2 wherein said memory
storage means is composed of a plurality of disks, and said
order is defined by a set of rules for maintaining a serial
b-tree as adapted to function using said ranges in a parallel
environment, thus creating a plurality of b-trees located on
said disks, each said b-tree represented as a separate said
graph composed of said individual nodes, every said individual
node contained on a given said portion of said disks
belonging to a different said global node from all other said
individual nodes contained on the given portion of said
disks.

5. A method as recited in claim 2 wherein the plurality of 
separate said graphs is created and maintained by the present 
method and each of said separate graphs has an identical 
structure as every other said separate graph, said structure 
defined by an identical positioning of each said individual 
node contained in a given said global node to each other said 
individual node contained in each adjacent said global node, 
that is, each said individual node belonging to the given said 
global node holds the same position within each said sepa- 
rate graph as every other said individual node belonging to 
the given said global node holds in its said separate graph, 
whereby each said separate graph is identical in form and 
function to each other said separate graph and is thus able to 
function as a separate data-structure on a single said pro- 
cessing element and is also able to be combined with the 
other said separate graphs and function as said parallel 
data-structure on multiple said processing elements. 

6. A method as recited in claim 1 further employing the 
steps of: 

a. identifying where said range is too broad for a given 
said global node thereby indicating a need to split said 
range, 

b. upon the indication of need to split said range, splitting 
said range by performing the range determination on 
the range being split and adjusting adjacent said ranges 
as necessary thereby creating at least one new said 
range, assigning the new range or ranges to a new said 
global node or nodes and performing the positioning to 
position the new nodes and existing nodes as necessary 
using said links thereby adding the new nodes to said 
order and maintaining said order, 

c. identifying where said range is too narrow for a given 
said global node thereby indicating a need to broaden 
said range, 

d. upon the indication of need to broaden said range, 
performing the range determination to adjust said 
ranges for adjacent said ranges as necessary, removing 
the global node containing the too narrow range if 
necessary, and performing the positioning as necessary 
to reconfigure said links for remaining said nodes 
thereby removing appropriate said nodes from said 
order and maintaining said order, 

e. upon the indication that said range or ranges are too 
broad or too narrow, adjusting said range or ranges by 
performing the range determination thereby adjusting 
said range or ranges to proper breadth, 

whereby said order is manipulated using said ranges, and 
said logical relationships are manipulated as necessary to 
change data storage patterns while maintaining said order of 
said data. 
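
Steps b and d of claim 6 might be sketched, purely as an example, with ranges kept as a sorted list of lower bounds and the contents of each global node flattened into one sorted list. The function names split_range and remove_range, and the choice to broaden a neighbouring range when a range is removed, are assumptions for the sketch.

```python
import bisect


def split_range(lows, buckets, i, pivot):
    """Split range i at pivot; a new range starting at pivot is added."""
    upper = lows[i + 1] if i + 1 < len(lows) else None
    if pivot <= lows[i] or (upper is not None and pivot >= upper):
        raise ValueError("pivot must fall strictly inside the range being split")
    old = buckets[i]
    cut = bisect.bisect_left(old, pivot)
    lows.insert(i + 1, pivot)
    # Redistribute the data values into the two resulting ranges.
    buckets[i:i + 1] = [old[:cut], old[cut:]]


def remove_range(lows, buckets, i):
    """Remove a too-narrow range i, broadening an adjacent range to absorb it."""
    if len(lows) < 2:
        raise ValueError("cannot remove the only range")
    if i == 0:
        # The right-hand neighbour is broadened downward.
        buckets[1] = buckets[0] + buckets[1]
        lows[1] = lows[0]
    else:
        # The left-hand neighbour is broadened upward.
        buckets[i - 1] = buckets[i - 1] + buckets[i]
    del lows[i]
    del buckets[i]
```

Both operations change only the range bounds and the placement of values; the order of the data as a whole is unchanged.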

7. A method as recited in claim 6 wherein the range split 
is performed by creating a new dependent range, said new 
dependent range based on the range being split, at least a 
portion of said new dependent range being beyond the range 
being split thus representing an extension of the range being 
split and narrowing the range being split by combining the 
ranges, 

whereby the range being split is narrowed by combining the
range being split with said new dependent range, the combination
of two ranges representing the combination of two
restrictions and therefore becoming more restrictive or narrower.

8. A method as recited in claim 6 wherein the identification
that the range is too broad is achieved by a rule for
fullness of said global nodes that employs a measurement of
the number and positions of the data values within said
global node, the identification of the too broad range defined
by an excess of the data values, said excess indicating that
said global node is sufficiently full for the range split and
thus for the addition of the new node, and wherein the
identification that the range is too narrow is performed by
the measurement of the number and positions of the data
values within said global node, the identification of the too
narrow range defined by an insufficiency of the data values,
said insufficiency indicating that said global node is sufficiently
empty for the removal of the node, and further
employing the steps of:

a. locating a proper said global node with a proper said
range to contain desired said data values by traveling
along said links and choosing a path through said links,
said path determined by using said ranges assigned to
said global nodes,

b. upon determination of the proper global node, determining
proper said individual node within the proper
global node to contain a given said data value,

c. upon determination of the proper individual node,
adding or removing said given data value to or from the
proper individual node thereby adding or removing the
given data value to or from the proper global node,

d. upon the addition or removal of the given data value,
determining if the proper global node is sufficiently full
or sufficiently empty,

e. upon determination that the proper global node is
sufficiently empty for the global node removal, performing
the global node removal wherein the global
node removal redistributes the data values as necessary
within their respective said ranges,

f. upon determination that said global node is sufficiently
full for the global node addition, performing the global
node addition wherein the global node addition splits
the range of the sufficiently full global node and redistributes
the data values as necessary within their
respective said ranges thereby splitting the sufficiently
full global node and adding said new global node or
nodes to said order.
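
Steps a through f above might be sketched as follows, for illustration only. The sketch assumes distinct ordinal values, a rule for fullness expressed as simple counts (max_fill and min_fill), placement into the least-loaded individual node, and round-robin redistribution; none of these particular choices is prescribed by the claim, and the class name GlobalNodes is hypothetical.

```python
import bisect


class GlobalNodes:
    def __init__(self, width, max_fill, min_fill):
        self.width = width            # individual nodes per global node
        self.max_fill = max_fill      # hypothetical "sufficiently full" limit
        self.min_fill = min_fill      # hypothetical "sufficiently empty" limit
        self.lows = [float("-inf")]   # lower bound of each global node's range
        self.nodes = [self._spread([])]

    def _spread(self, values):
        # Redistribute sorted values round-robin over the individual nodes.
        node = [[] for _ in range(self.width)]
        for k, v in enumerate(values):
            node[k % self.width].append(v)
        return node

    def _values(self, g):
        return sorted(v for n in self.nodes[g] for v in n)

    def _locate(self, value):                         # step a
        return bisect.bisect_right(self.lows, value) - 1

    def add(self, value):
        g = self._locate(value)
        target = min(self.nodes[g], key=len)          # step b
        bisect.insort(target, value)                  # step c
        if len(self._values(g)) > self.max_fill:      # step d
            self._split(g)                            # step f

    def remove(self, value):
        g = self._locate(value)
        for node in self.nodes[g]:
            if value in node:
                node.remove(value)                    # step c
                break
        else:
            raise KeyError(value)
        if len(self.nodes) > 1 and len(self._values(g)) < self.min_fill:
            self._merge(g)                            # step e

    def _split(self, g):
        values = self._values(g)
        mid = len(values) // 2
        self.nodes[g:g + 1] = [self._spread(values[:mid]),
                               self._spread(values[mid:])]
        self.lows.insert(g + 1, values[mid])

    def _merge(self, g):
        keep, drop = (g - 1, g) if g > 0 else (0, 1)
        pooled = self._values(keep) + self._values(drop)
        self.nodes[keep] = self._spread(pooled)
        del self.nodes[drop], self.lows[drop]
        if len(pooled) > self.max_fill:
            self._split(keep)
```

The sketch measures fullness by count alone, whereas the claim also allows the positions of the data values to be taken into account.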

9. A method as recited in claim 8 wherein said parallel
data-structure is adapted from a set of ordering rules of a
sequential data-structure and wherein said parallel data-structure
is maintained by said processing elements as
controlled by a parallel maintenance process adapted from a
sequential algorithm for maintaining said sequential data-structure
by utilizing the same said ordering rules applied to
said ranges rather than applied to individual said data values.

10. A method as recited in claim 9 wherein the data
maintained are ordinal and wherein said ranges defined for
said global nodes are unique, non-overlapping said ranges
covering the expanse of said data.
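
The property recited in claim 10 can be checked mechanically. The following sketch assumes each range is represented as a half-open pair (lo, hi) of ordinal bounds; the representation and the function name ranges_are_valid are assumptions for illustration.

```python
def ranges_are_valid(ranges):
    """Return True if the ranges are non-empty, non-overlapping and gap-free."""
    ordered = sorted(ranges)
    if any(lo >= hi for lo, hi in ordered):
        return False
    # Each range must begin exactly where the previous one ends.
    return all(a[1] == b[0] for a, b in zip(ordered, ordered[1:]))
```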

11. A method as recited in claim 10 wherein the method
is used to maintain key values on a distributed database that
are accessed by said plurality of system processes or a
plurality of users.

12. A method as recited in claim 1 wherein said global
relationships are expressed by a specific said parallel storage
location of said nodes within said memory storage means
such that a first memory address allocated for a first said
individual node is used to derive a second memory address
allocated for a second said individual node within the same
said global node,

whereby locating one said individual node within a given
said global node enables said processing elements to easily
derive the locations of other said individual nodes within
said global node.
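
One way such a derivation could work, offered only as an assumed example, is for every partition to place the individual nodes of a given global node at the same offset from that partition's base address. The function name sibling_address and the base-plus-offset layout are hypothetical.

```python
def sibling_address(addr, bases, src, dst):
    """Derive the address of the sibling individual node in partition dst.

    bases[p] is the assumed base address of partition p; an individual
    node sits at the same offset from the base in every partition.
    """
    offset = addr - bases[src]
    if offset < 0:
        raise ValueError("address lies below the source partition's base")
    return bases[dst] + offset
```

With partition bases 0x1000, 0x8000 and 0xF000, an individual node at 0x1040 in the first partition has its sibling in the third partition at 0xF040.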

13. A method as recited in claim 1 further employing a set 
of rules for arranging the data values within said global 
nodes to provide efficient locating means to locate within 
said memory storage means an exact said individual node 
contained in said global node that could contain a given said 
data value within said range. 

14. A machine to maintain an order for data on a computer 
system containing one or more processing means, one or 
more memory storage means, and communication means 
linking said processing means and said memory storage 
means to form said computer system comprising: 

a. range determination means to group said data into 
ranges, each said range capable of being arranged in a 
sequence or sequences with other said ranges such that 
said range determination means groups said data into 
multiple said ranges and such that said sequences 
between said ranges thereby arrange the data said 
ranges contain, 

b. distribution means to subdivide and distribute each said 
range to subsets that define subdivisions stored on 
multiple parallel or individual nodes on said memory 
storage means, 

c. composite global nodes containing the distribution of a 
given said range, said global node comprising multiple 
said individual nodes, each said individual node storing 
a portion of said range defined by said subdivision, 

d. relation means to define logical relationships between 
said global nodes and logical relationships within said 
global nodes by said ranges, wherein said logical 
relationship between said global nodes is defined by a 
difference between said ranges, such that if said relation 
means compares a given said global node with another 
said global node or their component said individual 
nodes, then said relation means achieves equivalent 
comparison results indicating said difference between 
said ranges, and wherein said logical relationship 
within said global node is defined by a commonality, 
such that component said individual nodes contained 
within said global node all have the same said logical 
relationship indicating said commonality in said range, 
such that said processing means are enabled to arrange 
said individual nodes with each other using said dif- 
ferences and enabled to arrange said individual nodes 
within said global nodes using said commonality, 

whereby said order is expressed by grouping said data into 
said ranges and defining said logical relationships between 
said ranges, and whereby said ranges are able to be distrib- 
uted within said computer system creating an arrangement 
of said data providing the order for said data such that it is 
easily accessed and maintained by multiple system pro- 
cesses. 

15. A machine as recited in claim 14 wherein said 
subdivisions are grouped into separate sets, each said set 
having its own valid said logical relationships between said 
subdivisions and therefore between said ranges, said indi- 
vidual nodes, and said global nodes, each said separate set 
defining a separate graph stored on a division of said 
memory storage means, thereby creating a plurality of said 
separate graphs, each said separate graph expressing said 
order, and all of said separate graphs together expressing 



said order thereby creating a parallel data-structure, and 
further including: 

a. individual access means providing an access to an
individual said separate graph as an individual expression
of said order, said separate graph accessed as a
valid separate data-structure, and

b. parallel access means to access multiple said separate
graphs together as parallel expressions of said order
wherein the access to the individual said separate graph
enables access to other said separate graphs, thereby
accessing multiple said separate graphs together as said
parallel data-structure,

whereby a plurality of said processing means, system processes
or users are enabled to efficiently access said data
through said separate graphs using separate access paths and
accessing separate parts of said memory storage means to
achieve consistent results.

16. A machine as recited in claim 15 further comprising:

a. range measurement means to determine if said ranges
need adjustment providing said range determination
means with cause to regroup said data,

b. range addition and removal means to add new said
ranges and remove old said ranges to and from said
graphs wherein said old ranges are deleted or merged
with other said ranges, and said new ranges are derived
or split from said old ranges and added in addition to
said old ranges, and

wherein said range measurement means determine if the
ranges must be added or removed, said range addition and
removal means add or remove said ranges, said distribution
means redistribute the data defined by said ranges as
necessary, and said relation means reconfigure said separate
graphs by adjusting the logical relationships between said
ranges,

whereby said processing means are enabled to alter a
configuration of said parallel data-structure while maintaining
said order.

17. A machine as recited in claim 15 wherein said
distribution means distributes the data to provide said processing
means a plurality of dynamically chosen access
paths to a given said subdivision or distributed part of range
for use by the parallel and individual access means,

whereby said processing means are enabled to choose freely
which said separate graph to use for access, accessing a
chosen said separate graph until a given said distributed part
of range is required, and

whereby said processing means are enabled to efficiently
distribute work through the free choice of which said
separate graph to use for access to said data.

18. A machine as recited in claim 16 wherein said
individual access means searches a given said separate graph
for a desired said range thereby identifying a proper said
global node to contain the desired range whereupon said
parallel access means locates a proper said subdivision or
subdivisions within said proper global node, said proper
subdivisions partially or completely containing the desired
range, the identification of the desired range allowing access
to desired said data whereupon said processing means uses
said data and may therefore have need to alter said order, the
alteration of said order is accomplished by using said range
measurement means and said range addition and removal
means thereby creating a parallel maintenance program
executed by said processing means for maintaining said
parallel data-structure.

19. A machine as recited in claim 16 wherein the range
addition means divides an existing said range into sub
ranges thereby creating said new ranges, and said distribution
means includes an efficient ordering scheme to redistribute
said data contained in the existing range to the new
range or ranges, thereby creating one or more new said
global nodes.

20. A machine as recited in claim 18 wherein said parallel 
maintenance program is adapted from a serial maintenance 
program for maintaining a serial data-structure, said parallel 
maintenance program creating said parallel data-structure 
and utilizing said ranges such that it functions as the adapted 
serial data-structure in a parallel environment. 

21. A machine as recited in claim 14 wherein said memory 
storage means comprises a plurality of memory storage units 
and wherein said individual nodes are distributed among 
said plurality of memory storage units and linked by said 
relation means to form a parallel data-structure.

22. A machine as recited in claim 21 wherein said 
processing means comprises a plurality of processing 
elements, each said processing element containing a main- 
tenance program for controlling said memory storage 
means, each said processing element able to control one said 
memory storage unit at a time and able to cooperate with 
other said processing elements to control multiple said 
memory storage units using said maintenance program, the 
plurality of maintenance programs thereby combining to 
form a parallel maintenance program, 

whereby said parallel maintenance program controls and 
orders said data through control of said parallel data- 
structure. 

23. A machine as recited in claim 22 wherein said parallel 
maintenance program is adapted from a serial maintenance 
algorithm, said parallel maintenance program functioning 
through the use of said ranges, said ranges used as parallel 
embodiments of the data used in said serial maintenance 
algorithm. 

24. An article of manufacture for a computer system, said 
computer system comprising a memory means and process- 
ing means, said processing means comprising one or more 
processing elements, said processing elements able to access 
said memory means as one or more logically corresponding 
storage locations or memory units, said article controlling an 
ordering of data on said computer system through a parallel 
storage of said data defining a parallel data structure, said 
article comprising: 

a. range determination rules that enable said computer 
system to group said data into sets according to ranges 
of said data, said range determination rules able to 
define multiple said sets with equivalent said ranges, 

b. data storage entity definition rules that enable said 
computer system to define data storage entities that 
contain part of said data as defined by said range, 

c. parallel node definition rules that enable said computer 
system to define parallel nodes, said parallel nodes 
containing one or more said data storage entities, said 
parallel nodes defined by said ranges indicating the data 
values that said parallel node is able to contain, 

d. composite global node definition rules that define 
global nodes as composites of said parallel nodes, said 
global nodes comprising multiple said parallel nodes 
with a sufficient commonality in said ranges, said 
parallel nodes having said commonality in said ranges 
being therefore within the same said global node, said 
parallel nodes having a difference in said ranges 
between said parallel nodes being therefore within 
separate said global nodes where said difference pro- 
duces sufficient distinction between said sets, said 
parallel nodes within the same said global node stored 
on logically corresponding said memory units, 



e. range relation rules that enable said computer system to 
logically relate said ranges and thereby relate said sets, 
said data and said parallel nodes, said range relation 
rules determining said commonality and said 
difference, 

whereby said ranges are logically related to each other 
thereby relating said sets, said parallel nodes, and the data 
values, and 

whereby the relations between said parallel nodes create a 
plurality of serial data structures linked by the commonality 
of ranges that defines said global nodes, thus expressing said 
ordering of data by creating the parallel or global data 
structure as a composite of said serial data structures, and 
thus providing parallel and global means to control the data 
structures. 

25. An article as recited in claim 24 further including: 

a. global node creation means that creates and defines said 
global node by grouping together said parallel nodes 
that are related by said commonality in ranges, 

b. global node relation rules that utilize said range relation 
rules to logically relate said global nodes to each other, 

whereby said computer system is enabled to globally 
manipulate said global nodes on said memory means. 

26. An article as recited in claim 25 further including: 

a. adjustment need rules that determine a need for adjust- 
ment to said ranges to maintain said order for said data, 

b. range adjustment rules that enable said computer sys- 
tem to adjust said ranges, changing the breadth of said 
ranges, 

wherein said range relation rules are used to adjust the 
logical relationships to appropriately relate the adjusted 
ranges, 

whereby said processing means are enabled to alter a first 
expression of said order to produce a second expression of 
said order while maintaining the rules that define said order 
for both of the expressions, and 

whereby changing the data organized in said order results in 
a change in a given expression of said order while main- 
taining the rules that define said order. 

27. An article as recited in claim 26 wherein said range 
relation rules further define said commonality to produce 
equivalent comparison results indicating said commonality 
when comparing one said parallel node within a given said 
global node to any of said parallel nodes within the same 
said global node, and further define said differences to 
produce equivalent comparison results indicating said dif- 
ferences when comparing one said parallel node within a 
given said global node to any of said parallel nodes within 
a separate said global node, and wherein said range adjust- 
ment rules contain range addition and removal rules to add 
new said ranges to said order and remove old said ranges 
from said order, adding new said parallel nodes and remov- 
ing old said parallel nodes as necessary, and adjusting the 
logical relationships as necessary. 

28. An article as recited in claim 26 wherein said com- 
puter system defines said order by using said commonality 
and said difference to create a parallel expression of a serial 
data structure with its own rules of ordering, thus defining 
said parallel data structure, said parallel data structure com- 
prising a plurality of separate said serial data structures 
related to each other by said commonality in ranges and 
configured by the rules of ordering said serial data struc- 
tures. 

29. An article as recited in claim 26 wherein said com-
puter system defines said order by using said commonality
and said difference to create a plurality of separate data
structures stored separately on said memory units, each said
separate data structure identical in configuration to each
other said separate data structure.

30. An article as recited in claim 27 wherein the range
addition is accomplished by splitting said old range into two
or more said new ranges, said new ranges being equal to said
old range when combined, said new ranges defined such that
each has an ordinal range relationship to each other, and
wherein said range relation rules relate the ranges in said
order to each other by said ordinal range relationship.

31. An article as recited in claim 30 further including: 

a. find global node means by which said order is searched
for a desired said range by using said range relation
rules,

b. add to global node means that adds the data values to 
said global nodes, 

c. remove from global node means that removes the data
values from said global nodes,

wherein locating the desired range allows access to a proper 
said global node to contain a given value of said data, and 
upon the locating, the data values are added to or removed 
from the proper global node altering the global node con-
tents as necessary, and said adjustment need rules determine
if the alteration of the global node contents results in said 
need for adjustment, whereupon said ranges are adjusted and 
the relationships are altered as necessary. 
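
The find, add, and remove steps of claim 31, together with the range splitting of claim 30, can be illustrated with a minimal single-process sketch. It is not the claimed implementation: the class names `GlobalNode` and `RangeOrder`, the half-open integer ranges, the `capacity` threshold standing in for the adjustment need rules, and the median split point are all assumptions made for illustration. Range removal and merging are omitted.

```python
from bisect import bisect_right


class GlobalNode:
    """One half-open range [low, high) plus the values stored in it."""

    def __init__(self, low, high):
        self.low, self.high = low, high
        self.values = []


class RangeOrder:
    """Ordered, non-overlapping ranges that together cover [low, high)."""

    def __init__(self, low, high, capacity=4):
        self.nodes = [GlobalNode(low, high)]  # kept sorted by lower bound
        self.capacity = capacity  # hypothetical "adjustment need" threshold

    def find_global_node(self, value):
        """Return the index of the range that properly contains value."""
        lows = [n.low for n in self.nodes]
        i = bisect_right(lows, value) - 1
        if i < 0 or not (self.nodes[i].low <= value < self.nodes[i].high):
            raise KeyError(value)
        return i

    def add(self, value):
        i = self.find_global_node(value)
        node = self.nodes[i]
        node.values.append(value)
        # Assumed adjustment need rule: split when a node overflows.
        if len(node.values) > self.capacity:
            self._split(i)

    def remove(self, value):
        i = self.find_global_node(value)
        self.nodes[i].values.remove(value)

    def _split(self, i):
        """Split one old range into two new ranges that combine to the old one."""
        old = self.nodes[i]
        ordered = sorted(old.values)
        mid = ordered[len(ordered) // 2]
        if mid == old.low:
            return  # splitting here would create an empty range
        left, right = GlobalNode(old.low, mid), GlobalNode(mid, old.high)
        left.values = [v for v in ordered if v < mid]
        right.values = [v for v in ordered if v >= mid]
        self.nodes[i:i + 1] = [left, right]  # ordinal relationship preserved
```

After five values are added to a single range with `capacity=4`, the range is split at the median value, and later lookups are routed to whichever of the two new ranges contains the searched value.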

32. An article as recited in claim 31 wherein the logical
relationships, said find global node means, the addition of
new parallel nodes and the removal of old parallel nodes are
adapted parallel versions of a search algorithm, logical
relation rules, node or data addition rules and node or data
removal rules of a serial data structure,
whereby said parallel data structure is a parallel version of
said serial data structure created and maintained in a parallel 
or distributed environment, said parallel data structure 
accomplishing the same goals as said serial data structure. 

33. An article as recited in claim 31 wherein said ranges
are non overlapping ranges that relate to each other in the
same fashion as the data values properly contained in said
ranges such that said range relation rules express the simi- 
larity in relationships to create said parallel data structure. 

34. An article as recited in claim 24 wherein said range
determination rules and said range relation rules create the
range definitions with a wide variety of uses, the range
definitions creating complex ranges, said complex ranges
defining complex sets, said complex ranges calculated using
said data in such a way that a given value of said data can
have membership in multiple said complex sets, said mul-
tiple complex sets intersecting each other where said mul-
tiple complex sets contain the data value with membership
in said multiple complex sets.

35. An article as recited in claim 35 wherein said complex
ranges are used to create said parallel data structure such that
it has at least two dimensions, and wherein each said
complex set is organized for access by a different aspect of 
the data values stored in said parallel data structure. 
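
The idea in claims 34 and 35 of one data value belonging to several complex sets, each organized for access by a different aspect, can be pictured with a small sketch. Everything here is an illustrative assumption rather than the claimed method: the class name `TwoAspectIndex`, the fixed-width ranges, and the dictionary records with `age` and `salary` fields are hypothetical.

```python
from collections import defaultdict


def bucket(value, width):
    """Return the half-open range (lo, lo + width) that contains value."""
    lo = (value // width) * width
    return (lo, lo + width)


class TwoAspectIndex:
    """Stores each record under one range per aspect (one dimension each)."""

    def __init__(self, widths):
        self.widths = widths  # hypothetical, e.g. {"age": 10, "salary": 1000}
        self.sets = {aspect: defaultdict(list) for aspect in widths}

    def add(self, record):
        # The same record gains membership in one set per aspect.
        for aspect, width in self.widths.items():
            self.sets[aspect][bucket(record[aspect], width)].append(record)

    def lookup(self, aspect, value):
        """Return the records whose range for this aspect contains value."""
        key = bucket(value, self.widths[aspect])
        return self.sets[aspect].get(key, [])
```

A record added once is then reachable through either dimension, and the sets for the two aspects intersect exactly at the records they share.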

36. An article as recited in claim 35 wherein said parallel
nodes related by said commonality in ranges are said com-
plex sets of said parallel nodes, and wherein each said
complex set of parallel nodes creates a complex said global 
node, said complex global nodes related to each other by 
said range relation rules. 

***** 






